Transgenic fish with tissue-specific expression

ABSTRACT

Disclosed are transgenic fish, and a method of making transgenic fish, which express transgenes in stable and predictable tissue- or developmentally-specific patterns. The transgenic fish contain transgene constructs with homologous expression sequences. Also disclosed are methods of using such transgenic fish. Such expression of transgenes allows the study of developmental processes, the relationship of cell lineages, the assessment of the effect of specific genes and compounds on the development or maintenance of specific tissues or cell lineages, and the maintenance of lines of fish bearing mutant genes.

BACKGROUND OF THE INVENTION

[0001] The disclosed invention is generally in the field of transgenic fish, and more specifically in the area of transgenic fish exhibiting tissue-specific expression of a transgene.

[0002] Transgenic technology has become an important tool for the study of gene and promoter function (Hanahan, Science 246:1265-75 (1989); Jaenisch, Science 240:1468-74 (1988)). The ability to express, and study the expression of, genes in whole animals can be facilitated by the use of transgenic animals. Transgenic technology is also a useful tool for cell lineage analysis and for transplantation experiments. Studies on promoter function or lineage analysis generally require the expression of a foreign reporter gene, such as the bacterial gene lacZ. Expression of a reporter gene can allow the identification of tissues harboring a transgene. Typically, transgenic expression has been identified by in situ hybridization or by histochemistry in fixed animals. Unfortunately, the inability to easily detect transgene expression in living animals severely limits the utility of this technology, particularly for lineage analysis.

[0003] An attractive paradigm for the understanding of gene expression, development, and genetics of animals, especially humans, is to study less complex organisms, such as Escherichia coli, Drosophila, and Caenorhabditis. The hope is that understanding of these processes in simple organisms will have relevance to similar processes in mammals and humans. The tradeoff is to accept the disadvantage that an experimental organism is only distantly related to humans for the advantage of easy manipulation, fast generation times, and more straightforward interpretation of results in the experimental organism. The disadvantage of this tradeoff can be lessened by using an organism that is as closely related as possible to mammals while retaining as many of the advantages of less complex organisms. The problem is to identify suitable organisms for such studies, and, more importantly, to develop the tools necessary to manipulate such organisms.

[0004] Some examples of cell determination in invertebrates have been shown to occur in progressive waves that are regulated by sequential cascades of transcription factors. Much less is known about such processes in vertebrates. An integrated approach combining embryological, genetic and molecular methods, such as that used to study development in Drosophila (for example, Ghysen et al., Genes & Dev 7:723-33 (1993)), would facilitate the identification of the molecular mechanisms involved in specifying neuronal fates in vertebrates, but such an approach has been hampered by a lack of robust genetic and molecular tools for use in vertebrates.

[0005] Transgenic technology has been applied to fish for various purposes. For example, transgenic technology has been applied to several commercially important varieties of fish, primarily in an attempt to improve their cultivation. The use of transgenic technology in fish has been reviewed by Moav, Israel J. of Zoology 40:441-466 (1994), Chen et al., Zoological Studies 34:215-234 (1995), and Iyengar et al., Transgenic Res. 5:147-166 (1996).

[0006] Stuart et al., Development 103:403-412 (1988), describe integration of foreign DNA into zebrafish, but no expression was observed. Stuart et al., Development 109:577-584 (1990), describe expression of a transgene in zebrafish from SV40 and Rous sarcoma virus transcription regulatory sequences. Although expression was seen in a pattern of tissues, the expression within a given tissue was variegated. Also, since Stuart et al. (1990) selected transgenics by expression and not by the presence of the transgene, non-expressing transgenics would have been missed by their analysis. Culp et al., Proc. Natl. Acad. Sci. USA 88:7953-7957 (1991), describe integration and germ line transmission of DNA in zebrafish. Although the constructs used included the Rous sarcoma virus LTR or SV40 enhancer promoter linked to a lacZ gene, no expression was observed. Bayer and Campos-Ortega, Development 115:421-426 (1992), describe integration and expression in zebrafish of a lacZ transgene having a minimal promoter (a mouse heat shock promoter) but no upstream regulatory sequences. The expression obtained depended on the site of integration, indicating that endogenous sequences at the site of integration of the fish were responsible for expression. Westerfield et al., Genes & Development 6:591-598 (1992), describe transient expression in zebrafish of β-galactosidase from mouse and human Hox gene promoters. Lin et al., Dev. Biology 161:77-83 (1994), describe transgenic expression of lacZ in living zebrafish embryos. The transgene linked the enhancer-promoter of the Xenopus elongation factor 1α gene with the lacZ coding sequence. Different lines of transgenic fish exhibited different patterns of expression, indicating that the site of integration may be affecting the pattern of expression. Amsterdam et al., Dev. Biology 171:123-129 (1995), and Amsterdam et al., Gene 173:99-103 (1996), describe transgenic expression of green fluorescent protein (GFP) in zebrafish. The transgene linked the enhancer-promoter of the Xenopus elongation factor 1α gene with the GFP coding sequence. As in Lin et al., Dev. Biology 161:77-83 (1994), different lines of transgenic fish exhibited different patterns of expression, indicating that the site of integration may be affecting the pattern of expression. Although some of the systems described above exhibited patterned expression, none resulted in the transmission of stable tissue-specific expression of a transgene in zebrafish.

[0007] It is an object of the present invention to provide transgenic fish having tissue- and developmentally-specific expression of transgenes.

[0008] It is another object of the present invention to provide a method of making transgenic fish having tissue- and developmentally-specific expression of transgenes.

[0009] It is another object of the present invention to provide a method of identifying compounds that affect expression of fish genes of interest.

[0010] It is another object of the present invention to provide a method of identifying the pattern of expression of fish genes of interest.

[0011] It is another object of the present invention to provide a method of identifying genes that affect expression of fish genes of interest.

[0012] It is another object of the present invention to provide a method of genetically marking mutant fish genes.

[0013] It is another object of the present invention to provide a method of identifying fish that have inherited a mutant gene.

[0014] It is another object of the present invention to provide a method of identifying enhancers and other regulatory sequences in fish.

[0015] It is another object of the present invention to provide a construct that exhibits tissue- and developmentally-specific expression in fish.

BRIEF SUMMARY OF THE INVENTION

[0016] Disclosed are transgenic fish, and a method of making transgenic fish, which express transgenes in stable and predictable tissue- or developmentally-specific patterns. The transgenic fish contain transgene constructs with homologous expression sequences. Also disclosed are methods of using such transgenic fish. Such expression of transgenes allows the study of developmental processes, the relationship of cell lineages, the assessment of the effect of specific genes and compounds on the development or maintenance of specific tissues or cell lineages, and the maintenance of lines of fish bearing mutant genes.

BRIEF DESCRIPTION OF THE DRAWINGS

[0017] FIG. 1A shows the nucleotide sequence at the exon/intron junctions of the zebrafish GATA-1 locus. The conserved splice sequences are underlined and the intron sequences are listed within parentheses. The amino acids encoded by the exon regions flanking the introns are shown beneath the nucleotide sequence. The upstream splice junction nucleotide sequences are SEQ ID NO:6 (IVS-1), SEQ ID NO:7 (IVS-2), SEQ ID NO:8 (IVS-3), and SEQ ID NO:9 (IVS-4). The downstream splice junction nucleotide sequences are SEQ ID NO:10 (IVS-1), SEQ ID NO:11 (IVS-2), SEQ ID NO:12 (IVS-3), and SEQ ID NO:13 (IVS-4). The amino acid sequences spanning the introns are SEQ ID NO:14 (IVS-1), SEQ ID NO:15 (IVS-2), SEQ ID NO:16 (IVS-3), and SEQ ID NO:17 (IVS-4).

[0018] FIG. 1B is a diagram of the structure of the zebrafish GATA-1 locus. Exon regions are filled. Intron regions are unfilled. The tall filled boxes represent the coding regions. The arrow indicates the putative transcription start site. EcoRI endonuclease sites are labeled E. BglII endonuclease sites are labeled G. BamHI endonuclease sites are labeled B.

[0019] FIG. 2 is a diagram of the structures of three GATA-1/GFP transgene constructs used to make transgenic fish. The filled region to the right of the GM2 box in each construct represents the 5.4 kb or 5.6 kb region of the GATA-1 locus upstream of the GATA-1 coding region. The box labeled GM2 represents a sequence encoding the modified green fluorescent protein. The thin angled lines in constructs (1) and (3) represent vector or linking sequences. EcoRI endonuclease sites are labeled E. BglII endonuclease sites are labeled G. BamHI endonuclease sites are labeled B. In construct (3), the BamHI/EcoRI fragment on the right side is the downstream BamHI/EcoRI fragment of the GATA-1 locus.

[0020] FIG. 3 is a diagram of the structures of GATA-2/GFP transgene constructs for analyzing the expression sequences of the GATA-2 gene. The line represents all or upstream deleted portions of a 7.3 kb region upstream of the translation start site in the zebrafish GATA-2 gene. The hatched box represents a segment encoding the modified GFP and including a SV40 polyadenylation signal. Tick marks labeled P, Sa, A, C, and Sc indicate restriction sites PstI, SacI, AatII, ClaI and ScaI, respectively, in the 7.3 kb region.

[0021] FIG. 4 is a diagram of the structures of GATA-2/GFP transgene constructs for analyzing the expression sequences of the GATA-2 gene. The thick open box represents a 1116 bp fragment of the upstream region of the GATA-2 gene required for neuron-specific expression. The thin open box represents segments of the upstream region of the GATA-2 gene proximal to the transcription start site. The thick line represents the minimal promoter of the Xenopus elongation factor 1α gene. The hatched box represents a segment encoding the modified GFP and including a SV40 polyadenylation signal.

[0022] FIG. 5 is a graph of the percent of embryos microinjected with the transgene constructs shown in FIG. 4 that expressed GFP in neurons.

[0023] FIG. 6 is a graph of the percent of embryos microinjected with transgene constructs that expressed GFP in neurons. The transgene constructs were nsP5-GM2 and truncated forms of nsP5-GM2.

[0024] FIG. 7 is a graph of the percent of embryos microinjected with transgene constructs that expressed GFP in neurons. The transgene constructs were mutant forms of the ns3831 truncation of nsP5-GM2.

DETAILED DESCRIPTION OF THE INVENTION

[0025] Disclosed are transgenic fish, and a method of making transgenic fish, which express transgenes in stable and predictable tissue- or developmentally-specific patterns. Also disclosed are methods of using such transgenic fish. Such expression of transgenes allows the study of developmental processes, the relationship of cell lineages, the assessment of the effect of specific genes and compounds on the development or maintenance of specific tissues or cell lineages, and the maintenance of lines of fish bearing mutant genes. The disclosed transgenic fish are characterized by homologous expression sequences in an exogenous construct introduced into the fish or a progenitor of the fish.

[0026] As used herein, transgenic fish refers to fish, or progeny of a fish, into which an exogenous construct has been introduced. A fish into which a construct has been introduced includes fish which have developed from embryonic cells into which the construct has been introduced. As used herein, an exogenous construct is a nucleic acid that is artificially introduced, or was originally artificially introduced, into an animal. The term artificial introduction is intended to exclude introduction of a construct through normal reproduction or genetic crosses. That is, the original introduction of a gene or trait into a line or strain of animal by cross breeding is intended to be excluded. However, fish produced by transfer, through normal breeding, of an exogenous construct (that is, a construct that was originally artificially introduced) from a fish containing the construct are considered to contain an exogenous construct. Such fish are progeny of fish into which the exogenous construct has been introduced. As used herein, progeny of a fish are any fish which are descended from the fish by sexual reproduction or cloning, and from which genetic material has been inherited. In this context, cloning refers to production of a genetically identical fish from DNA, a cell, or cells of the fish. The fish from which another fish is descended is referred to as a progenitor fish. As used herein, development of a fish from a cell or cells (embryonic cells, for example), or development of a cell or cells into a fish, refers to the developmental process by which fertilized egg cells or embryonic cells (and their progeny) grow, divide, and differentiate to form an adult fish.

[0027] The examples illustrate the manner in which transgenic fish exhibiting cell lineage-specific expression can be made and used. The transgenic fish described in the examples, and the transgene constructs used, are particularly useful for early detection of fish expressing the transgene, the study of erythroid cell development, the study of neuronal development, and as a reporter for genetically linked mutant genes.

[0028] Tissue-, developmental stage-, or cell lineage-specific expression of a reporter gene from a regulated promoter in the disclosed transgenic fish can be useful for identifying the pattern of expression of the gene from which the promoter is derived. Such expression can also allow study of the pattern of development of a cell lineage. As used herein, tissue-specific expression refers to expression substantially limited to specific tissue types. Tissue-specific expression is not necessarily limited to expression in a single tissue but includes expression limited to one or more specific tissues. As used herein, developmental stage-specific expression refers to expression substantially limited to specific developmental stages. Developmental stage-specific expression is not necessarily limited to expression at a single developmental stage but includes expression limited to one or more specific developmental stages. As used herein, cell lineage-specific expression refers to expression substantially limited to specific cell lineages. As used herein, cell lineage refers to a group of cells that are descended from a particular cell or group of cells. In development, for example, newly specialized or differentiated cells can give rise to cell lineages. Cell lineage-specific expression is not necessarily limited to expression in a single cell lineage but includes expression limited to one or more specific cell lineages. All of these types of specific expression can operate in the same gene. For example, a developmentally regulated gene can both be expressed at specific developmental stages and be limited to specific tissues. As used herein, the pattern of expression of a gene refers to the tissues, developmental stages, cell lineages, or combinations of these in or at which the gene is expressed.
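The definitions above can be illustrated with a minimal Python sketch. This is not part of the disclosed method. The record type `ExpressionPattern`, the helper `is_specific`, and the tissue and stage labels are all hypothetical, and the sketch simplifies "substantially limited" to strict set membership.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionPattern:
    """Hypothetical record of where and when a gene is expressed."""
    tissues: frozenset = frozenset()
    stages: frozenset = frozenset()
    lineages: frozenset = frozenset()


def is_specific(observed, universe):
    """True if expression occurs in one or more, but not all, categories."""
    return observed <= universe and 0 < len(observed) < len(universe)


# Illustrative labels only; they are assumptions, not data from the text.
ALL_TISSUES = frozenset({"blood", "heart", "brain", "muscle"})
ALL_STAGES = frozenset({"early", "middle", "late"})

pattern = ExpressionPattern(
    tissues=frozenset({"blood"}),
    stages=frozenset({"early", "middle"}),
)

tissue_specific = is_specific(pattern.tissues, ALL_TISSUES)
stage_specific = is_specific(pattern.stages, ALL_STAGES)
```

In this sketch one pattern is both tissue-specific and developmental stage-specific, matching the statement that several types of specific expression can operate in the same gene.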

[0029] 1. Transgene Constructs

[0030] Transgene constructs are the genetic material that is introduced into fish to produce a transgenic fish. Such constructs are artificially introduced into fish. The manner of introduction, and, often, the structure of a transgene construct, render such a transgene construct an exogenous construct. Although a transgene construct can be made up of any nucleic acid sequences, for use in the disclosed transgenic fish it is preferred that the transgene constructs combine expression sequences operably linked to a sequence encoding an expression product. The transgene construct will also preferably include other components that aid expression, stability or integration of the construct into the genome of a fish. As used herein, components of a transgene construct referred to as being operably linked or operatively linked refer to components being so connected as to allow them to function together for their intended purpose. For example, a promoter and a coding region are operably linked if the promoter can function to result in transcription of the coding region.

[0031] A. Expression Sequences

[0032] Expression sequences are used in the disclosed transgene constructs to mediate expression of an expression product encoded by the construct. As used herein, expression sequences include promoters, upstream elements, enhancers, and response elements. It is preferred that the expression sequences used in the disclosed constructs be homologous expression sequences. As used herein, in reference to components of transgene constructs used in the disclosed transgenic fish, homologous indicates that the component is native to or derived from the species or type of fish involved. Conversely, heterologous indicates that the component is neither native to nor derived from the species or type of fish involved.
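The homologous/heterologous distinction can be sketched as a simple classification. The sketch below is illustrative only. The `Component` record and the `classify` function are hypothetical names, and comparing species names by string equality is an assumed simplification of "native to or derived from the species or type of fish involved."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Hypothetical description of one part of a transgene construct."""
    name: str
    kind: str            # for example "promoter" or "enhancer"
    source_species: str  # species the sequence is native to or derived from


def classify(component, host_species):
    """Label a component relative to the fish receiving the construct."""
    if component.source_species == host_species:
        return "homologous"
    return "heterologous"


HOST = "Danio rerio"  # zebrafish

gata1 = Component("GATA-1 upstream region", "promoter", "Danio rerio")
ef1a = Component("EF1 alpha enhancer-promoter", "promoter", "Xenopus")

gata1_label = classify(gata1, HOST)  # zebrafish sequence in zebrafish
ef1a_label = classify(ef1a, HOST)    # Xenopus sequence in zebrafish
```

Under these assumptions a zebrafish GATA-1 expression sequence used in zebrafish is labeled homologous, while the Xenopus elongation factor 1α enhancer-promoter used in the earlier zebrafish studies is labeled heterologous.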

[0033] Two large scale chemical mutagenesis screens recently produced thousands of zebrafish mutants affecting development (Driever et al., Development 123:37-46 (1996); Haffter et al., Development 123:1-36 (1996)). The genes affected in these mutants, and their expression patterns, are of significant interest for understanding the developmental process. Therefore, expression sequences from these genes are preferred for use as expression sequences in the disclosed constructs.

[0034] As used herein, expression sequences are divided into two main classes, promoters and enhancers. A promoter is generally a sequence or sequences of DNA that function when in a relatively fixed location in regard to the transcription start site. A promoter contains core elements required for basic interaction of RNA polymerase and transcription factors, and may contain upstream elements and response elements. Enhancer generally refers to a sequence of DNA that functions at no fixed distance from the transcription start site and can be in either orientation. Enhancers function to increase transcription from nearby promoters. Enhancers also often contain response elements that mediate the regulation of transcription. Promoters can also contain response elements that mediate the regulation of transcription.

[0035] Enhancers often determine the regulation of expression of a gene. This effect has been seen in so-called enhancer trap constructs, where a construct containing a reporter gene operably linked to a promoter is expressed only when the construct inserts into the domain of an enhancer (O'Kane and Gehring, Proc. Natl. Acad. Sci. USA 84:9123-9127 (1987), Allen et al., Nature 333:852-855 (1988), Kothary et al., Nature 335:435-437 (1988), Gossler et al., Science 244:463-465 (1989)). In such cases, the expression of the construct is regulated according to the pattern of the newly associated enhancer. Transgenic constructs having only a minimal promoter can be used in the disclosed transgenic fish to identify enhancers.

[0036] Preferred enhancers for use in the disclosed transgenic fish are those that mediate tissue- or cell lineage-specific expression. More preferred are homologous enhancers that mediate tissue- or cell lineage-specific expression. Still more preferred are enhancers from fish GATA-1 and GATA-2 genes. Most preferred are enhancers from zebrafish GATA-1 and GATA-2 genes.

[0037] For expression of encoded peptides or proteins, a transgene construct also needs sequences that, when transcribed into RNA, mediate translation of the encoded expression products. Such sequences are generally found in the 5′ untranslated region of transcribed RNA. This region corresponds to the region on the construct between the transcription initiation site and the translation initiation site (that is, the initiation codon). The 5′ untranslated region of a construct can be derived from the 5′ untranslated region normally associated with the promoter used in the construct, the 5′ untranslated region normally associated with the sequence encoding the expression product, the 5′ untranslated region of a gene unrelated to the promoter or sequence encoding the expression product, or a hybrid of these 5′ untranslated regions. Preferably, the 5′ untranslated region is homologous to the fish into which the construct is to be introduced. Preferred 5′ untranslated regions are those normally associated with the promoter used.

[0038] B. Expression Products

[0039] Transgene constructs for use in the disclosed transgenic fish can encode any desired expression product, including peptides, proteins, and RNA. Expression products can include reporter proteins (for detection and quantitation of expression), and products having a biological effect on cells in which they are expressed (by, for example, adding a new enzymatic activity to the cell, or preventing expression of a gene). Many such expression products are known or can be identified.

[0040] Reporter Proteins

[0041] As used herein, a reporter protein is any protein that can be specifically detected when expressed. Reporter proteins are useful for detecting or quantitating expression from expression sequences. For example, operatively linking a nucleotide sequence encoding a reporter protein to tissue-specific expression sequences allows one to carefully study lineage development. In such studies, the reporter protein serves as a marker for monitoring developmental processes, such as cell migration. Many reporter proteins are known and have been used for similar purposes in other organisms. These include enzymes, such as β-galactosidase, luciferase, and alkaline phosphatase, that can produce specific detectable products, and proteins that can be directly detected. Virtually any protein can be directly detected by using, for example, specific antibodies to the protein. A preferred reporter protein that can be directly detected is the green fluorescent protein (GFP). GFP, from the jellyfish Aequorea victoria, produces fluorescence upon exposure to ultraviolet light without the addition of a substrate (Chalfie et al., Science 263:802-5 (1994)). Recently, a number of modified GFPs have been created that generate as much as 50-fold greater fluorescence than does wild type GFP under standard conditions (Cormack et al., Gene 173:33-8 (1996); Zolotukhin et al., J. Virol 70:4646-54 (1996)). This level of fluorescence allows the detection of low levels of tissue-specific expression in a living transgenic animal.

[0042] The use of reporter proteins that, like GFP, are directly detectable without requiring the addition of exogenous factors is preferred for detecting or assessing gene expression during zebrafish embryonic development. A transgenic zebrafish embryo, carrying a construct encoding a reporter protein and tissue-specific expression sequences, can provide a rapid real time in vivo system for analyzing spatial and temporal expression patterns of developmentally regulated genes.

[0043] C. Other Construct Sequences

[0044] The disclosed transgene constructs preferably include other sequences which improve expression from, or stability of, the construct. For example, including a polyadenylation signal on the constructs encoding a protein ensures that transcripts from the transgene will be processed and transported as mRNA. The identification and use of polyadenylation signals in expression constructs is well established. It is preferred that homologous polyadenylation signals be used in the transgene constructs.

[0045] It is also known that the presence of introns in primary transcripts can increase expression, possibly by causing the transcript to enter the processing and transport system for mRNA. It is preferred that an intron, if used, be included in the 5′ untranslated region or the 3′ untranslated region of the transgene transcript. It is also preferred that the intron be homologous to the fish used, and more preferably homologous to the expression sequences used (that is, that the intron be from the same gene that some or all of the expression sequences are from). The use and importance of these and other components useful for transgene constructs are discussed in Palmiter et al., Proc. Natl. Acad. Sci. USA 88:478-482 (1991); Sippel et al., “The Regulatory Domain Organization of Eukaryotic Genomes: Implications For Stable Gene Transfer” in Transgenic Animals (Grosveld and Kollias, eds., Academic Press, 1992), pages 1-26; Kollias and Grosveld, “The Study of Gene Regulation in Transgenic Mice” in Transgenic Animals (Grosveld and Kollias, eds., Academic Press, 1992), pages 79-98; and Clark et al., Phil. Trans. R. Soc. Lond. B. 339:225-232 (1993).

[0046] The disclosed constructs are preferably integrated into the genome of the fish. However, the disclosed transgene construct can also be constructed as an artificial chromosome. Such artificial chromosomes containing more than 200 kb have been used in several organisms. Artificial chromosomes can be used to introduce very large transgene constructs into fish. This technology is useful since it can allow faithful recapitulation of the expression pattern of genes that have regulatory elements that lie many kilobases from coding sequences.

[0047] 2. Fish

[0048] The disclosed constructs and methods can be used with any type of fish. As used herein, fish refers to any member of the classes collectively referred to as pisces. It is preferred that fish belonging to species and varieties of fish of commercial or scientific interest be used. Such fish include salmon, trout, tuna, halibut, catfish, zebrafish, medaka, carp, tilapia, goldfish, and loach.

[0049] The most preferred fish for use with the disclosed constructs and methods is zebrafish, Danio rerio. Zebrafish are an increasingly popular experimental animal since they have many of the advantages of popular invertebrate experimental organisms, and include the additional advantage that they are vertebrates. Another significant advantage of zebrafish for the study of development and cell lineages is that, like Caenorhabditis, they are largely transparent (Kimmel, Trends Genet 5:283-8 (1989)). The generation of thousands of zebrafish mutants (Driever et al., Development 123:37-46 (1996); Haffter et al., Development 123:1-36 (1996)) provides abundant raw material for transgenic study of these animals. General zebrafish care and maintenance is described by Streisinger, Natl. Cancer Inst. Monogr. 65:53-58 (1984).

[0050] Zebrafish embryos are easily accessible and nearly transparent. Given these characteristics, a transgenic zebrafish embryo, carrying a construct encoding a reporter protein and tissue-specific expression sequences, can provide a rapid real time in vivo system for analyzing spatial and temporal expression patterns of developmentally regulated genes. In addition, embryonic development of the zebrafish is extremely rapid. In 24 hours an embryo develops rudiments of all the major organs, including a functional heart and circulating blood cells (Kimmel, Trends Genet 5:283-8 (1989)). Other fish with some or all of the same desirable characteristics are also preferred.

[0051] 3. Production of Transgenic Fish

[0052] The disclosed transgenic fish are produced by introducing a transgene construct into cells of a fish, preferably embryonic cells, and most preferably in a single cell embryo. Where the transgene construct is introduced into embryonic cells, the transgenic fish is obtained by allowing the embryonic cell or cells to develop into a fish. Introduction of constructs into embryonic cells of fish, and subsequent development of the fish, are simplified by the fact that embryos develop outside of the parent fish in most fish species.

[0053] The disclosed transgene constructs can be introduced into embryonic fish cells using any suitable technique. Many techniques for such introduction of exogenous genetic material have been demonstrated in fish and other animals. These include microinjection (described by, for example, Culp et al. (1991)), electroporation (described by, for example, Inoue et al., Cell. Differ. Develop. 29:123-128 (1990); Müller et al., FEBS Lett. 324:27-32 (1993); Murakami et al., J. Biotechnol. 34:35-42 (1994); Müller et al., Mol. Mar. Biol. Biotechnol. 1:276-281 (1992); and Symonds et al., Aquaculture 119:313-327 (1994)), particle gun bombardment (Zelenin et al., FEBS Lett. 287:118-120 (1991)), and the use of liposomes (Szelei et al., Transgenic Res. 3:116-119 (1994)). Microinjection is preferred. The preferred method for introduction of transgene constructs into fish embryonic cells by microinjection is described in the examples.

[0054] Embryos or embryonic cells can generally be obtained by collecting eggs immediately after they are laid. Depending on the type of fish, it is generally preferred that the eggs be fertilized prior to or at the time of collection. This is preferably accomplished by placing a male and female fish together in a tank that allows egg collection under conditions that stimulate mating. After collecting eggs, it is preferred that the embryo be exposed for introduction of genetic material by removing the chorion. This can be done manually or, preferably, by using a protease such as pronase. A preferred technique for collecting zebrafish eggs and preparing them for microinjection is described in the examples. A fertilized egg cell prior to the first cell division is considered a one cell embryo, and the fertilized egg cell is thus considered an embryonic cell.

[0055] After introduction of the transgene construct the embryo is allowed to develop into a fish. This generally need involve no more than incubating the embryos under the same conditions used for incubation of eggs. However, the embryonic cells can also be incubated briefly in an isotonic buffer. If appropriate, expression of an introduced transgene construct can be observed during development of the embryo.

[0056] Fish harboring a transgene can be identified by any suitable means. For example, the genome of potential transgenic fish can be probed for the presence of construct sequences. To identify transgenic fish actually expressing the transgene, the presence of an expression product can be assayed. Several techniques for such identification are known and used for transgenic animals and most can be applied to transgenic fish. Probing of potential or actual transgenic fish for nucleic acid sequences present in or characteristic of a transgene construct is preferably accomplished by Southern or Northern blotting. Also preferred is detection using polymerase chain reaction (PCR) or other sequence-specific nucleic acid amplification techniques. Preferred techniques for identifying transgenic zebrafish are described in the examples.

[0057] 4. Identifying the Pattern of Expression of Fish Genes

[0058] Identifying the pattern of expression in the disclosed transgenic fish can be accomplished by measuring or identifying expression of the transgene in different tissues (tissue-specific expression), at different times during development (developmentally regulated expression or developmental stage-specific expression), or in different cell lineages (cell lineage-specific expression). These assessments can also be combined by, for example, measuring expression (and observing changes, if any) in a cell lineage during development. The nature of the expression product to be detected can have an effect on the suitability of some of these analyses. On one level, different tissues of a fish can be dissected and expression can be assayed in the separate tissue samples. Such an assessment can be performed when using almost any expression product. This technique is commonly used in transgenic animals and is useful for assessing tissue-specific expression.

[0059] This technique can also be used to assess expression during the course of development by assaying for the expression product at different developmental stages. Where detection of the expression product requires fixing of the sample or other treatments that destroy or kill the developing embryo or fish, multiple embryos must be used. This is only practical where the expression pattern in different embryos is expected to be the same or similar. This will be the case when using the disclosed transgenic fish having stable and predictable expression.

[0060] A more preferred way of assessing the pattern of expression of a transgene during development is to use an expression product that can be detected in living embryos and animals. A preferred expression product for this purpose is the green fluorescent protein (GFP). A preferred form of GFP and a preferred technique for measuring the presence of GFP in living fish are described in the examples.

[0061] Expression products of the disclosed transgene constructs can be detected using any appropriate method. Many means of detecting expression products are known and can be applied to the detection of expression products in transgenic fish. For example, RNA can be detected using any of numerous nucleic acid detection techniques. Some of these detection methods as applied to transgenic fish are described in the examples. The use of reporter proteins as the expression product is preferred since such proteins are selected based on their detectability. The detection of several useful reporter proteins is described by Iyengar et al. (1996).

[0062] In zebrafish, the nervous system and other organ rudiments appear within 24 hours of fertilization. Since the nearly transparent zebrafish embryo develops outside its mother, the origin and migration of lineage progenitor cells can be monitored by following expression of an expression product in transgenic fish. In addition, the regulation of a specific gene can be studied in these fish.

[0063] Using zebrafish promoters that drive expression in specific tissues, a number of transgenic zebrafish lines can be generated that express a reporter protein in each of the major tissues including the notochord, the nervous system, the brain, the thymus, and in other tissues (see Table 1). Other important lineages for which specific expression can be obtained include neural crest, germ cells, liver, gut, and kidney. Additional tissue-specific transgenic fish can be generated by using “enhancer trap” constructs to identify expression sequences in fish.

TABLE 1

Source of Expression Sequences | Tissues/Cell lineages
GATA-1 | Erythroid progenitor
GATA-2 | Hematopoietic stem cells/CNS
Tinman | Heart
Rag-1 | T and B Cells
Globin | Mature red blood cells
MEF | Muscle progenitors
Goosecoid | Dorsal organizer
SCL-1 | Hematopoietic stem cells
Rbtn-2 | Hematopoietic stem cells
No-tail | Notochord
Flk-1 | Vascular endothelia
Eve-1 | Ventral/posterior cells
Ikaros | Early lymphoid progenitors
Pdx-1 | Pancreas
Islet-1 | Motoneuron
Shh | Multi-tissue induction/Left-right symmetry
Twist | Axial mesoderm/Left-right symmetry
Krox20 | Brain
BMP4 | Ventral mesoderm induction

[0064] 5. Identifying Compounds that Affect Expression of Fish Genes

[0065] For many genes, and especially for genes involved in developmental processes, it would be useful to identify compounds that affect expression of the genes. The disclosed transgenic fish can be exposed to compounds to assess the effect of the compound on the expression of a gene of interest. For example, test compounds can be administered to transgenic fish harboring an exogenous construct containing the expression sequences of a fish gene of interest operably linked to a sequence encoding a reporter protein. By comparing the expression of the reporter protein in fish exposed to a test compound to those that are not exposed, the effect of the compound on the expression of the gene from which the expression sequences are derived can be assessed.

[0066] 6. Identifying Genes that Affect Expression of Fish Genes

[0067] Numerous mutants have been generated and characterized in zebrafish which affect most developmental processes. The disclosed transgenic fish can be used in combination with these and other mutations to assess the effect of a mutant gene on the expression of a gene of interest. For example, mutations can be introduced into strains of transgenic fish harboring an exogenous construct containing the expression sequences of a fish gene of interest operably linked to a sequence encoding a reporter protein. By comparing the expression of the reporter protein in fish with a mutation to those without the mutation, the effect of the mutation on the expression of the gene from which the expression sequences are derived can be assessed.

[0068] The effect of such mutations on specific developmental processes and on the growth and development of specific cell lineages can also be assessed using the disclosed transgenic fish expressing a reporter protein in specific cell lineages or at specific developmental stages.

[0069] 7. Genetically Marking Mutant Fish Genes

[0070] The disclosed transgene constructs can be used to genetically mark mutant genes or chromosome regions. For example, in zebrafish, recent chemical mutagenesis screens have generated more than one thousand different mutants with defects in most developmental processes. If fish carrying a mutation generated in these screens could be more easily identified, a lot of time and labor would be saved. One way to promote rapid identification of fish carrying mutations would be the establishment of balancer chromosomes that carry markers that can be easily identified in living fish. This technology has greatly facilitated the task of identification and maintenance of mutant stocks in Drosophila (Ashburner, Drosophila, A Laboratory Manual (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989); Lindsey and Zimm, The Genome of Drosophila melanogaster (Academic Press, San Diego, Calif., 1995)). As used herein, genetically marking a gene or chromosome region refers to genetically linking a reporter gene to the gene or chromosome region. Genetic linkage between two genetic elements (such as genes) refers to the elements being in sufficiently close proximity on a chromosome that they do not segregate from each other at random in genetic crosses. The closer the genetic linkage, the more likely that the two elements will segregate together. For genetic marking, it is preferred that the transgene construct segregate with the gene or chromosomal region of interest more than 60% of the time, it is more preferred that the transgene construct segregate with the gene or chromosomal region of interest more than 70% of the time, it is still more preferred that the transgene construct segregate with the gene or chromosomal region of interest more than 80% of the time, it is still more preferred that the transgene construct segregate with the gene or chromosomal region of interest more than 90% of the time, and it is most preferred that the transgene construct segregate with the gene or chromosomal region of interest more than 95% of the time.
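
The preference tiers above can be read as thresholds on how often the marker and the gene of interest are inherited together. The following is a minimal Python sketch, assuming a simple two-locus model in which the cosegregation frequency equals one minus the recombination fraction. The function names and the example recombination fraction are illustrative assumptions and are not part of the disclosure.

```python
def cosegregation_frequency(recombination_fraction: float) -> float:
    """Fraction of meioses in which marker and gene are inherited together.

    Assumption: simple two-locus model, cosegregation = 1 - recombination fraction.
    """
    if not 0.0 <= recombination_fraction <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    return 1.0 - recombination_fraction


def preference_tier(coseg: float) -> str:
    """Map a cosegregation frequency onto the tiers stated in the text."""
    tiers = [
        (0.95, "most preferred (>95%)"),
        (0.90, "still more preferred (>90%)"),
        (0.80, "still more preferred (>80%)"),
        (0.70, "more preferred (>70%)"),
        (0.60, "preferred (>60%)"),
    ]
    # Check the strictest threshold first so the highest matching tier wins.
    for threshold, label in tiers:
        if coseg > threshold:
            return label
    return "below the preferred range"


if __name__ == "__main__":
    # Hypothetical marker lying 3% recombination away from the mutant gene.
    print(preference_tier(cosegregation_frequency(0.03)))
```

Unlinked elements (recombination fraction 0.5) cosegregate half of the time and therefore fall below every tier.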

[0071] Example 1 shows that living transgenic fish carrying insertions of a transgene, in which the zebrafish GATA-1 promoter has been ligated to the green fluorescent protein (GFP) reporter gene, can be identified by simple observation of GFP expression in blood cells. As in Drosophila, zebrafish chromosomal recombination occurs at a significantly lower rate during spermatogenesis than it does during oogenesis. Therefore, a transgene insertion that maps near a chemically induced mutant gene can be crossed into the mutant chromosome through oogenesis and will then remain linked to the mutation in male fish through many generations. This procedure will allow the identification of progeny harboring the mutant gene by simple observation of GFP in blood cells.

[0072] In the case of zebrafish, 200 lines carrying the GATA-1/GFP transgene (or another reporter construct) randomly inserted throughout the zebrafish genome should result in an average of 8 insertions in each of the 25 zebrafish chromosomes. This is possible since expression from the disclosed constructs is not limited by effects of the site of insertion and the site of integration is not limited. The insertion sites can be mapped and then crossed through oogenesis into zebrafish lines that carry a mutation that maps nearby. Once established, mutant strains that carry balancer chromosomes can be maintained in male fish.
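
The average of 8 insertions per chromosome follows directly from dividing 200 lines among 25 chromosomes. The short Python sketch below reproduces that arithmetic and, as an added illustration, estimates how likely a given chromosome is to receive no insertion at all. It assumes one insertion site per line and an equal chance of landing on each chromosome. Both assumptions are simplifications introduced here and are not stated in the text.

```python
N_LINES = 200        # transgenic lines, from the text
N_CHROMOSOMES = 25   # zebrafish chromosomes, from the text


def mean_insertions_per_chromosome(n_lines: int, n_chromosomes: int) -> float:
    """Average number of insertions per chromosome (one insertion per line assumed)."""
    return n_lines / n_chromosomes


def prob_chromosome_unmarked(n_lines: int, n_chromosomes: int) -> float:
    """Probability that one particular chromosome receives no insertion.

    Assumption: each line inserts independently and lands on any
    chromosome with equal probability 1 / n_chromosomes.
    """
    return (1.0 - 1.0 / n_chromosomes) ** n_lines


if __name__ == "__main__":
    print(mean_insertions_per_chromosome(N_LINES, N_CHROMOSOMES))
    print(prob_chromosome_unmarked(N_LINES, N_CHROMOSOMES))
```

Under these assumptions the chance that any particular chromosome is left without a marker is well below one in a thousand.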

[0073] Although it is preferred that mutant genes be genetically marked, any gene of interest or any chromosome region can be marked, and the maintenance and inheritance of the gene can be monitored, in a similar manner. As used herein, an identified mutant gene is a mutant gene that is known or that has been identified, in contrast to a mutant gene which may be present in an organism but which has not been recognized.

[0074] Genetic mapping of mutant genes or transgenes in fish can be performed using established techniques and the principles of genetic crosses. Generally, mapping involves determining the linkage relationships between genetic elements by assessing whether, and to what extent, two or more genetic elements tend to cosegregate in genetic crosses.

[0075] 8. Identifying Fish that have Inherited a Mutant Gene

[0076] Mutant fish in which the mutant gene is marked with an exogenous construct expressing a reporter protein simplify the identification of progeny fish that carry the mutant gene. For example, after a cross, progeny fish can be screened for expression of the reporter protein. Those that express the reporter protein are very likely to have inherited the mutant gene which is genetically linked. Those progeny fish not expressing the reporter protein can be excluded from further analysis.

[0077] Although recombination during gametogenesis may result in segregation of the exogenous construct from the mutant gene, this will happen only rarely. Initial screening for fish expressing the reporter protein will still ensure that the majority of such progeny fish will carry the mutant gene. Presence of the mutation can be confirmed by subsequent direct testing for the mutant gene.

[0078] 9. Identifying and Cloning Regulatory Sequences from Fish

[0079] The disclosed constructs can also be used as “enhancer traps” to generate transgenic fish that exhibit tissue-specific expression of an expression product. Transgenic animals carrying enhancer trap constructs often exhibit tissue-specific expression patterns due to the effects of endogenous enhancer elements that lie near the position of integration.

[0080] Once it is determined that the exogenous construct is operably linked to an enhancer or other regulatory sequence in a fish, the regulatory element can be isolated by re-cloning the transgene construct. Many general cloning techniques can be used for this purpose. A preferred method of cloning regulatory sequences that have become linked to a transgene construct in a fish is to isolate and cleave genomic DNA from the fish with a restriction enzyme that does not cleave the exogenous construct. The resulting fragments can be cloned in vitro and screened for the presence of characteristic transgene sequences. A search for enhancers in zebrafish using a transgene construct having only a promoter operably linked to a sequence encoding a reporter protein has generated a transgenic line that expresses GFP exclusively in hatching gland cells.

[0081] A similar procedure can be followed to identify promoters. In this case, a “promoter probe” construct, which lacks any expression sequences, is used. Only if the construct is inserted into the genome downstream of expression sequences will the expression product encoded by the construct be expressed.

[0082] 10. Identifying Promoters and Enhancers in Cloned Expression Sequences

[0083] The linked genomic sequences of clones identified as containing expression sequences, or any other nucleic acid segment containing expression sequences, can then be characterized to identify potential and actual regulatory sequences. For example, a deletion series of a positive clone can be tested for expression in transgenic fish. Sequences essential for expression, or for a pattern of expression, are identified as those which, when deleted from a construct, no longer support expression or the pattern of expression. The ability to assess the pattern of expression of a transgene in fish using the disclosed transgenic fish and methods makes it possible to identify the elements in the regulatory sequences of a fish gene that are responsible for the pattern of expression. The disclosed transgenic fish, since they can be produced routinely and consistently, allow meaningful comparison of the expression of different deletion constructs in separate fish.
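
The logic of a deletion series can be sketched in a few lines of Python. The sketch below flags positions that were deleted in at least one construct that lost expression and were never deleted in a construct that retained expression. The coordinates, the construct list, and the function name are hypothetical, and the approach deliberately ignores complications such as redundant elements or elements that act only in combination.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), in bp along the full-length clone


def candidate_essential_regions(
    length: int, constructs: List[Tuple[Interval, bool]]
) -> List[Interval]:
    """Return intervals implicated as essential by a deletion series.

    Each construct is ((start, end), expresses): the deleted interval and
    whether the expression pattern was still observed.
    """
    dispensable = [False] * length  # deleted in a construct that still expressed
    implicated = [False] * length   # deleted in a construct that lost expression
    for (start, end), expresses in constructs:
        for pos in range(start, end):
            if expresses:
                dispensable[pos] = True
            else:
                implicated[pos] = True

    # Merge consecutive flagged positions into intervals.
    regions: List[Interval] = []
    run_start = None
    for pos in range(length + 1):
        flagged = pos < length and implicated[pos] and not dispensable[pos]
        if flagged and run_start is None:
            run_start = pos
        elif not flagged and run_start is not None:
            regions.append((run_start, pos))
            run_start = None
    return regions


if __name__ == "__main__":
    # Hypothetical 3 kb clone with four internal deletions.
    series = [
        ((0, 1000), True),
        ((1000, 2000), False),
        ((2000, 3000), True),
        ((1500, 2500), False),
    ]
    print(candidate_essential_regions(3000, series))
```

In this made-up series only the interval from 1000 to 2000 is both lost in non-expressing constructs and never lost in an expressing one, so it is the only candidate reported.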

[0084] An example of the power of this capability is described in Example 2. Application of this system to the study of the GATA-2 promoter has led to identification of enhancer regions that facilitate gene expression specifically in hematopoietic precursors, the enveloping layer (EVL) and the central nervous system (CNS). Through site-directed mutagenesis, it has been discovered that the DNA sequence CCCTCCT is essential for the neuron-specific activity of the GATA-2 promoter.
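
As an illustration of how a short element such as CCCTCCT might be located in a cloned promoter before mutagenesis, the following Python sketch scans a sequence on both strands. The test sequence is invented for the example and is not GATA-2 promoter sequence, and the function names are assumptions.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def find_motif(seq: str, motif: str = "CCCTCCT"):
    """Return sorted (0-based start, strand) hits for the motif on both strands."""
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)  # allow overlapping matches
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sequence carrying one copy of the motif on each strand.
    print(find_motif("ATGCCCTCCTAAAGGAGGGTT"))
```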

[0085] 11. Isolating Cells Expressing an Expression Product

[0086] Using cell sorting based on the presence of an expression product, pure populations of cells expressing a transgene construct can be isolated from other cells. Where the transgene construct is expressed in particular cell lineages or tissues, this can allow the purification of cells from that particular lineage. These cells can be used in a variety of in vitro studies. For instance, these pure cell populations can provide mRNA for differential display or subtractive screens for identifying genes expressed in that cell lineage. Progenitor cells of specific tissue could also be isolated. Establishing such cells in tissue culture would allow the growth factor needs of these cells to be determined. Such knowledge could be used to culture non-transgenic forms of the same cells or related cells in other organisms.

[0087] Cell sorting is preferably facilitated by using a construct expressing a fluorescent protein or an enzyme producing a fluorescent product. This allows fluorescence activated cell sorting (FACS). A preferred fluorescent protein for this purpose is the green fluorescent protein. The ability to generate transgenic fish expressing GFP in a tissue- and cell lineage-specific manner for different cell types indicates that transgenic fish that express GFP in other types of tissues can be generated in a straightforward manner. The disclosed FACS approach can therefore be used as a general method for isolating pure cell populations from developing embryos based solely on gene expression patterns. This method for isolation of specific cell lineages is preferably performed using constructs linking GFP with the expression sequences of genes identified as being involved in development. Numerous such genes have been or can be identified as mutants that affect development. Cells isolated in this manner should be useful in transplantation experiments.

[0088] Publications cited herein and the material for which they are cited are specifically incorporated by reference.

[0089] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

EXAMPLES

Example 1

[0090] Tissue-specific Expression and Germline Transmission of a Transgene in Zebrafish.

[0091] In this example, DNA constructs containing the putative zebrafish expression sequences of GATA-1, an erythroid-specific transcription factor, operatively linked to a sequence encoding the green fluorescent protein (GFP), were microinjected into single-cell zebrafish embryos.

[0092] GATA-1, an early marker of the erythroid lineage, was initially identified through its effects upon globin gene expression (Evans and Felsenfeld, Cell 58:877-85 (1989); Tsai et al., Nature 339:446-51 (1989)). Since then GATA-1 has been shown to be a member of a multigene family. Members of this gene family encode transcription factors that recognize the DNA core consensus sequence, WGATAR (SEQ ID NO:18). GATA factors are key regulators of many important developmental processes in vertebrates, particularly hematopoiesis (Orkin, Blood 80:575-81 (1992)). The importance of GATA-1 for hematopoiesis was definitively demonstrated in null mutations in mouse (Pevny et al., Nature 349:257-60 (1991)). In chimeric mice, embryonic stem cells carrying a null mutation in GATA-1, created via homologous recombination, contributed to all non-hematopoietic tissues tested and to a white blood cell fraction, but failed to give rise to mature red blood cells.

[0093] In zebrafish, GATA-1 expression is restricted to erythroid progenitor cells that initially occupy a ventral extra-embryonic position, similar to the situation found in other vertebrates (Detrich et al., Proc Natl Acad Sci USA 92:10713-7 (1995)). As development proceeds, these cells enter the zebrafish embryo and form a distinct structure known as the hematopoietic intermediate cell mass (ICM).

[0094] Vertebrate hematopoiesis is a complex process that proceeds in distinct phases, at various anatomic sites, during development (Zon, Blood 86:2876-91 (1995)). Although studies on in vitro model systems have generated some insight into hematopoietic development (Cumano et al., Cell 86:907-16 (1996); Kennedy et al., Nature 386:488-493 (1997); Medvinsky and Dzierzak, Cell 86:897-906 (1996); Nakano et al., Science 272:722-4 (1996)), the origin of hematopoietic progenitor cells during vertebrate embryogenesis is still controversial. Therefore, an in vivo model should be useful to determine precisely the cellular and molecular mechanisms involved in hematopoietic development. Such a model could also be used to identify compounds and genes that affect hematopoiesis. In mammals, since embryogenesis occurs internally, it is difficult to carefully observe hematopoietic processes.

[0095] Zebrafish have a number of features that facilitate the study of vertebrate hematopoiesis. Because development is external and embryos are nearly transparent, the migration of labeled hematopoietic cells can be easily monitored. In addition, many mutants that are defective in hematopoietic development have been generated (Ransom et al., Development 123:311-319 (1996); Weinstein et al., Development 123:303-309 (1996)). Zebrafish embryos that significantly lack circulating blood can survive for several days, so downstream effects of mutations upon gene expression deleterious to embryonic hematopoietic development can be characterized. Since the cellular processes and molecular regulation of hematopoiesis are generally conserved throughout vertebrate evolution, results from zebrafish embryonic studies can also provide insight into the mechanisms involved in mammalian hematopoiesis.

[0096] Cloning and sequencing of GATA-1 genomic DNA

[0097] A zebrafish genomic phage library was screened with a ³²P-radiolabeled probe containing a region of zebrafish GATA-2 cDNA that encodes a conserved zinc finger. A number of positive clones were identified. The inserts in these clones were cut with various restriction enzymes. The resulting fragments were subcloned into pBluescript II KS(−) and sequenced. Based on DNA sequence analysis, two phage clones were shown to contain zebrafish GATA-1 sequences. The cDNA sequence of zebrafish GATA-1 is described by Detrich et al., Proc. Natl. Acad. Sci. USA 92:10713 (1995). The nucleotide sequence of the GATA-1 promoter region is shown in SEQ ID NO:26.

[0098] Plasmid constructs

[0099] Construct G1-(Bgl)-GM2 was generated by ligating a modified GFP reporter gene (GM2) to a 5.4 kb EcoRI/BglII fragment that contains putative zebrafish GATA-1 expression sequences, that is, the 5′ flanking sequences upstream of the major GATA-1 transcription start site. GM2 contains 5′ wild type GFP and a 3′ NcoI/EcoRI fragment derived from a GFP variant, m2, that emits approximately 30-fold greater fluorescence than does the wild type GFP under standard FITC conditions (Cormack et al., Gene 173:33-8 (1996)). This construct is illustrated as construct (1) in FIG. 2.

[0100] To isolate expression sequences in the 5′ untranslated region of GATA-1, a 5.6 kb DNA fragment was amplified by the polymerase chain reaction (PCR) from a GATA-1 genomic subclone using a T7 primer, which is complementary to the vector sequence, and a specific primer, Oligo (1), that is complementary to the cDNA sequence just 5′ of the GATA-1 translation start. The GATA-1 specific primer contained a BamHI site to facilitate subsequent cloning. The PCR reaction was performed using Expand™ Long Template PCR System (Boehringer Mannheim) for 30 cycles (94° C., 30 seconds; 60° C., 30 seconds; 68° C., 5 minutes). After digestion with BamHI and XhoI, this 5.6 kb DNA fragment was gel purified and ligated to DNA encoding the modified GFP, resulting in construct G1-GM2 (construct (2) in FIG. 2). The construct G1-(5/3)-GM2 was generated by ligating an additional 4 kb of GATA-1 genomic sequences, which contains GATA-1 intron and exon sequences, to the 3′ end (following the polyadenylation signal) of the reporter gene in construct G1-GM2. This construct is illustrated as construct (3) in FIG. 2.

[0101] Fish and Microinjection

[0102] Wild type zebrafish embryos were used for all microinjections. The zebrafish were originally obtained from pet shops (Culp et al., Proc Natl Acad Sci USA 88:7953-7 (1991)). Fish were maintained on reverse osmosis-purified water to which Instant Ocean (Aquarium Systems, Mentor, Ohio) was added (50 mg/l). Plasmid DNA G1-GM2 was linearized using restriction enzyme AatII (which cuts in the vector backbone), while plasmid DNA G1-(5/3)-GM2 was excised from the vector by digestion with restriction enzyme SacI, and separated using a low melting agarose gel. DNA fragments were cleaned using GENECLEAN II Kit (Bio101 Inc.) and resuspended in 5 mM Tris, 0.5 mM EDTA, 0.1 M KCl at a final concentration of 50 μg/ml prior to microinjection. Single cell embryos were prepared and injected as described by Culp et al., Proc Natl Acad Sci USA 88:7953-7 (1991), except that tetramethyl-rhodamine dextran was included as an injection control. This involved collecting newly fertilized eggs, dechorionating the eggs with pronase (used at 0.5 mg/ml), and injecting DNA. Injection with each construct was done independently 5 to 10 times and the data obtained were pooled.

[0103] Fluorescent microscopic observation and imaging

[0104] Embryos and adult fish were anesthetized using tricaine (Sigma A-5040) as described previously (Westerfield, The Zebrafish Book (University of Oregon Press, 1995)) and examined under a FITC filter on a Zeiss microscope equipped with a video camera. Images of circulating blood cells were produced by printing out individual frames of recorded videos. Other pictures of fluorescent embryos were generated by superimposing a bright field image on a fluorescent image using Adobe Photoshop software. One month old fish were anesthetized and then rapidly embedded in OCT. Sections of 60 μm were cut using a cryostat and were immediately observed by fluorescence microscopy.

[0105] Identification of germline transgenic fish by PCR

[0106] DNA isolation, internal control primers and PCR conditions were the same as described by Lin et al. (Dev Biol 161:77-83 (1994)). Briefly, DNA was extracted from pools of 40 to several hundred dechorionated embryos (obtained from mating a single pair of fish) at 16 to 24 hours of development by vortexing for 1 minute in a buffer containing 4 M guanidinium isothiocyanate, 0.25 mM sodium citrate (pH 7.0), 0.5% Sarkosyl, and 0.1 M β-mercaptoethanol. The sample was extracted once with phenol:chloroform:isoamyl alcohol (25:24:1) and total nucleic acid was precipitated by the addition of 3 volumes of ethanol and 1/10 volume sodium acetate (3 M, pH 5.5). The pellet was washed once in 70% ethanol and dissolved in 1×TE (pH 8.0).

[0107] Approximately 0.5 μg of DNA was used in a PCR reaction containing 20 mM Tris (pH 8.3), 1.5 mM MgCl₂, 25 mM KCl, 100 μg/ml gelatin, 20 pmole each PCR primer, 50 μM each dNTP, and 2.5 U Taq DNA polymerase (Pharmacia). The reaction was carried out at 94° C. for 2.5 minutes for 30 cycles with a 5 minute initial 94° C. denaturation step, and a 7 minute final 72° C. elongation step. Specific primers, Oligos (2) and (3), that were used to detect GFP, generated a 267 bp product. A pair of internal control primers homologous to sequences of the zebrafish homeobox gene, ZF-21 (Njolstad et al., FEBS Letters 230:25-30 (1988)), was included in each reaction. This pair of primers should generate a PCR product of 475 bp for all PCR reactions using zebrafish DNA.

[0108] Preparation of embryonic cells and flow cytometry

[0109] Embryos were disrupted in Holtfreter's solution using a 1.5 ml pellet pestle (Kontes Glass, OEM749521-1590). Cells were collected by centrifugation (400 g, 5 minutes). After digestion with 1×Trypsin/EDTA for 15 minutes at 32° C., the cells were washed twice with phosphate buffered saline (PBS) and filtered through a 40 micron nylon mesh. Fluorescence activated cell sorting (FACS) was performed under standard FITC conditions.

[0110] cDNA synthesis and PCR

[0111] Total RNA was extracted from FACS purified cells using the RNA isolation kit, TRIzol (Bio101). Reverse transcription and PCR (RT-PCR) were performed using the Access RT-PCR System from Promega (Catalog #A1250). Specific primers, Oligos (4) and (5), used to detect the zebrafish GATA-1 cDNA, generated a 410 bp product.

[0112] Oligonucleotides

[0113] (1) 5′-CCGGATCCTGCAAGTGTAGTATTGAA-3′ (GATA-1, promoter antisense; SEQ ID NO:1);

[0114] (2) 5′-AATGTATCAATCATGGCAGAC-3′ (GM2 sense; SEQ ID NO:2);

[0115] (3) 5′-TGTATAGTTCATCCATGCCATGTG-3′ (GM2 antisense; SEQ ID NO:3);

[0116] (4) 5′-ATGAACCTTTCTACTCAAGCT-3′ (GATA-1, cDNA sense; SEQ ID NO:4);

[0117] (5) 5′-GCTGCTTCCACTTCCACTCAT-3′ (GATA-1, cDNA antisense; SEQ ID NO:5).
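
The oligonucleotides listed above can be checked programmatically, for instance to confirm that only the GATA-1 promoter primer carries the BamHI recognition sequence GGATCC mentioned in the plasmid construction. The Python sketch below is illustrative only. The dictionary layout and function name are assumptions, while the sequences are copied from the list above.

```python
OLIGOS = {
    1: "CCGGATCCTGCAAGTGTAGTATTGAA",  # GATA-1 promoter antisense
    2: "AATGTATCAATCATGGCAGAC",       # GM2 sense
    3: "TGTATAGTTCATCCATGCCATGTG",    # GM2 antisense
    4: "ATGAACCTTTCTACTCAAGCT",       # GATA-1 cDNA sense
    5: "GCTGCTTCCACTTCCACTCAT",       # GATA-1 cDNA antisense
}

BAMHI_SITE = "GGATCC"  # BamHI recognition sequence


def oligo_summary(seq: str) -> dict:
    """Length, GC content and 0-based BamHI site position (-1 if absent)."""
    gc = seq.count("G") + seq.count("C")
    return {
        "length": len(seq),
        "gc_percent": round(100 * gc / len(seq), 1),
        "bamhi_at": seq.find(BAMHI_SITE),
    }


if __name__ == "__main__":
    for number, seq in OLIGOS.items():
        print(number, oligo_summary(seq))
```

Only Oligo (1) contains the site, placed near its 5′ end so that the amplified promoter fragment can be cut with BamHI for cloning.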

[0118] Whole-mount RNA in situ hybridization

[0119] Sense and antisense digoxigenin-labeled RNA probes were generated from a GATA-1 genomic subclone containing the second and third exon coding sequence using a DIG/Genius™ 4 RNA Labeling Kit (SP6/T7) (Boehringer Mannheim). RNA in situ hybridizations were performed as described (Westerfield, The Zebrafish Book (University of Oregon Press, 1995)).

[0120] Genomic structure of the zebrafish GATA-1

[0121] Two clones containing zebrafish GATA-1 sequences were isolated from a lambda phage zebrafish genomic library as described above. Restriction enzyme mapping indicated that the two overlapping clones contained approximately 35 kb of the GATA-1 locus. To define the promoter of the zebrafish GATA-1 gene, transcription initiation sites for the zebrafish GATA-1 were mapped by primer extension. As in chicken, mouse, human and other species, multiple transcription initiation sites were identified. A major transcription initiation site was mapped 187 bases upstream of the translation start.

[0122] Comparison of the GATA-1 genomic structure for human, mouse and chicken suggested that the intron-exon junction sequences of this gene are likely to be conserved throughout vertebrates. Oligonucleotide primers flanking potential GATA-1 introns were designed and used to sequence the zebrafish genomic clones. Sequence analysis revealed that the zebrafish GATA-1 gene consists of five exons and four introns which lie within a 6.5 kb genomic region (FIG. 1). Although the exon-intron number and junction sequences are well conserved between zebrafish and other vertebrates, the zebrafish GATA-1 introns are smaller than in other species.

[0123] Transient expression of GFP driven by the GATA-1 promoter in zebrafish embryos

[0124] Based on the zebrafish GATA-1 genomic structure, three GFP reporter gene constructs were generated (FIG. 2). Construct G1-(Bgl)-GM2 was generated by ligation of a modified GFP reporter gene (GM2) to a 5.4 kb EcoRI/BglII fragment that contains the 5′ flanking sequences upstream of the major GATA-1 transcription start site. Construct G1-GM2 contained a 5.6 kb region upstream of the translation start of GATA-1. The third construct, G1-(5/3)-GM2, was generated by ligating an additional 4 kb of GATA-1 genomic sequences, which contain intron and exon sequences, to the 3′ end of the reporter gene in construct G1-GM2. Each construct was microinjected into the cytoplasm of single cell zebrafish embryos. GFP reporter gene expression in the embryos was examined at a number of distinct developmental stages by fluorescence microscopy.

[0125] GFP expression was observed in embryos injected with either construct G1-GM2 or construct G1-(5/3)-GM2 as early as 80% epiboly, approximately 8 hours post fertilization (pf). At that time, GFP positive cells were restricted to the ventral region of the injected embryos. At 16 hours pf, GFP expression was clearly visible in the developing intermediate cell mass (ICM), the earliest hematopoietic tissue in zebrafish. After 24 hours pf, GFP positive cells were observed in circulating blood and could be continuously observed in circulating blood for several months. During the first five days pf, examination of circulating blood revealed two distinct cell populations with different levels of GFP expression. One cell type was larger and brighter; the other smaller and less bright. No significant difference in GFP expression levels was detected between embryos injected with either construct G1-GM2 or G1-(5/3)-GM2. However, injection of construct G1-(Bgl)-GM2 yielded very weak GFP expression in developing embryos. This result indicated that either the GATA-1 transcription initiation site was removed by BglII restriction digestion, or that the 5′ untranslated region of zebrafish GATA-1 is required for high level tissue specific expression of GFP. It is not surprising that a construct lacking the 5′ untranslated region of GATA-1 did not generate much GFP expression in microinjected embryos. These regions are often needed for transcript stability. At times, these regions also contain binding sites for regulators of gene expression.

[0126] At least 75% of the embryos injected with the G1-GM2 or G1-(5/3)-GM2 construct showed some degree of ICM specific GFP expression (Table 2). The number of GFP positive cells in the ICM or in circulation ranged from a single cell to a few hundred cells. Less than 7% of these embryos showed GFP expression in non-hematopoietic tissues, usually limited to fewer than ten cells per embryo. Such non-specific expression of GFP was usually observed in the notochord, muscle, and enveloping cell layers. These observations indicated that a genomic GATA-1 fragment extending approximately 5.6 kb upstream from the GATA-1 translation start site ligated to GFP sufficed to recapitulate the embryonic pattern of GATA-1 expression in zebrafish.

TABLE 2

Constructs | No. observed embryos | No. embryos with GFP expression in ICM (%) | No. embryos with strong GFP expression in ICM (%)^(a) | No. embryos with non-specific expression GFP (%)
G1-GM2 | 336 | 274 (81.5%) | 177 (52.7%) | 15 (4.5%)
G1-GM2(5/3) | 248 | 187 (75.4%) | 150 (60.5%) | 16 (6.5%)
G1(BglII)-GM2 | 370 | 0 (0%) | 0 (0%) | 19 (5.1%)

^(a) # cells in the ICM.
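
The percentages in Table 2 can be recomputed from the raw embryo counts. The Python sketch below does so. The counts are taken from the table, while the dictionary layout and function name are assumptions made for the illustration.

```python
# (observed, GFP in ICM, strong GFP in ICM, non-specific GFP), counts from Table 2
TABLE_2 = {
    "G1-GM2": (336, 274, 177, 15),
    "G1-GM2(5/3)": (248, 187, 150, 16),
    "G1(BglII)-GM2": (370, 0, 0, 19),
}


def percentages(row):
    """Express each count in a row as a percentage of observed embryos."""
    observed, *counts = row
    return tuple(round(100 * count / observed, 1) for count in counts)


if __name__ == "__main__":
    for construct, row in TABLE_2.items():
        print(construct, percentages(row))
```

The recomputed values agree with the percentages printed in the table and with the statements in the text that at least 75% of embryos showed ICM expression and fewer than 7% showed non-specific expression.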

[0127] GFP expression in germline GATA-1/GFP transgenic zebrafish

[0128] Microinjected zebrafish embryos were raised to sexual maturity and mated. Progeny were tested by PCR to determine the frequency of germline transmission of the GATA-1/GFP transgene. Eight of six hundred and seventy two founder fish have transmitted GFP to the F1 generation. Examination of these fish by fluorescence microscopy revealed that seven of eight lines expressed GFP in the ICM and in circulating blood cells. GFP expression patterns in the ICM were consistent with the RNA in situ hybridization patterns previously observed for GATA-1 mRNA expression in zebrafish (Detrich et al., Proc Natl Acad Sci USA 92:10713-7 (1995)). In the two lines where F2 transgenic fish have been obtained, GFP expression in blood cells was observed in 50% of the progeny when a transgenic F2 was mated to a non-transgenic fish. This indicated that GFP was transmitted to progeny in a Mendelian fashion. Southern blot analysis showed that GFP transgene insertions occurred at different sites in these two lines. In one line, transgenic fish apparently carry 4 copies of the transgene and in the other line, 7 copies.

[0129] Blood cells were collected from 48 hour transgenic fish by heart puncture and a blood smear was observed by fluorescence microscopy. Two distinct populations of fluorescent cells were observed in these smears. As in the circulation of embryos that transiently express GFP, one cell population was observed that was large and bright and another that was smaller and less bright. Although the blood cells collected from adult transgenic zebrafish showed some variability in fluorescence intensity, they appeared to have uniform size. Blood cells collected from non-transgenic fish showed no fluorescence.

[0130] In two day old transgenic zebrafish, weak GFP expression was observed in the heart. GFP expression was also observed in the eyes and, in three of seven transgenic lines, in some neurons of the spinal cord. Expression in the eyes peaked between 30 and 48 hours pf and became extremely weak by day 4. It is thought that expression of GFP in eyes and neurons may replicate the authentic GATA-1 expression pattern.

[0131] Examination of GFP expression in tissues of one month old fish showed that the head kidney contained a large number of fluorescent cells. This result suggests that the kidney is the site of adult erythropoiesis in zebrafish. It has been reported that GATA-1 is expressed in the testes of mice. Expression of GFP was not found in testes dissected from adult fish. It is possible that the disclosed GATA-1 transgene constructs lack an enhancer required for testis expression of GATA-1. Other tissues including brain, muscle and liver had no detectable level of GFP expression.

[0132] FACS analysis of GATA-1/GFP transgenic fish

[0133] GFP expression in GATA-1/GFP transgenic fish allowed isolation of a pure population of the earliest erythroid progenitor cells for in vitro studies by fluorescence activated cell sorting. F1 transgenic embryos were collected at the onset of GFP expression and cell suspensions were prepared. Approximately 3.6% of the cells from whole transgenic fish were fluorescence positive, as compared to 0.12% in the non-transgenic controls. Based on the number of embryos used, FACS analysis suggested that there are approximately three hundred erythroid progenitor cells per embryo at 14 hours pf.

[0134] To determine whether the FACS purified cells are enriched for GATA-1, RNA was isolated from these cells and GATA-1 mRNA levels were determined by RT-PCR. The results indicated that these cells were highly enriched for GATA-1 mRNA.

[0135] Erythroid specific expression was observed in living embryos during early development. Fluorescent circulating blood cells were detected in microinjected embryos 24 hours after fertilization and could still be observed in two month old fish. Germline transgenic fish obtained from the injected founders continued to express GFP in erythroid cells in the F1 and F2 generations. The GFP expression patterns in transgenic fish were consistent with the RNA in situ hybridization pattern generated for GATA-1 mRNA expression. These transgenic fish allowed isolation, by fluorescence activated cell sorting, of the earliest erythroid progenitor cells from developing embryos. Using constructs containing other zebrafish promoters and GFP, it will be possible to generate transgenic fish that allow continuous visualization of the origin and migration of any lineage specific progenitor cells in a living embryo.

[0136] The results described in this example indicate that monitoring GFP expression can be a more sensitive method than RNA in situ detection by which to determine gene expression patterns. For instance, in the disclosed GATA-1/GFP transgenic fish, GFP expression in circulating blood allowed two types of cells to be distinguished. One cell type was larger and brighter; the other smaller and less bright. There were fewer of the larger, brighter cell type. These cells are believed to be erythroid precursors while the more abundant, smaller cells are believed to be fully differentiated erythrocytes. Preliminary cell transplantation experiments with embryonic blood cells have shown that they contain a cell population that has long-term proliferation capacity.

[0137] In two day old transgenic zebrafish, GFP expression was observed in the heart. In adult transgenic zebrafish, GFP expression was observed in the kidney. By histological methods, it has been shown that the heart endocardium is a transitional site for hematopoiesis in embryonic zebrafish and that the kidney is the site of adult hematopoiesis (Al-Adhami and Kunz, Develop. Growth and Differ. 19:171-179 (1977)). The results in GATA-1/GFP transgenic fish support these observations.

[0138] The GFP expression seen in the eyes and neurons of embryonic transgenic fish may be due to a lack of a transcriptional silencer in the transgene constructs. It seems unlikely that the GFP expression in the eyes is due to positional effects caused by the sites of insertion since all seven transgenic lines have GFP expression in embryonic fish eyes.

[0139] Using fluorescence activated cell sorting, pure populations of hematopoietic progenitor cells were isolated from the ICM of transgenic zebrafish. Since approximately 10⁷ cells can be sorted per hour, 10⁵ to 10⁶ purified ICM cells can be obtained in a few hours. These cells, which are derived from the earliest site of hematopoiesis in zebrafish, can be used in a variety of in vitro studies. For instance, these pure cell populations can provide mRNA for differential display or subtractive screens for identifying novel hematopoietic genes. Erythroid precursors obtained from the ICM might also be established in tissue culture. This would allow the growth factor needs of these cells to be determined.
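The yield estimate above can be reproduced with simple arithmetic. The sketch below assumes, purely for illustration, that the fluorescence-positive fraction reported in paragraph [0133] (3.6%, against a 0.12% background in non-transgenic controls) applies to the suspension being sorted and that every positive cell is recovered; neither assumption is stated in the text, and the function name `hours_to_collect` is hypothetical.

```python
SORT_RATE_CELLS_PER_HOUR = 1e7   # sorting rate given in the text
FRACTION_POSITIVE = 0.036        # GFP-positive fraction in transgenic embryos
FRACTION_BACKGROUND = 0.0012     # apparent positives in non-transgenic controls

# Fraction of sorted events attributable to the transgene (assumption:
# background events are simply subtracted).
net_fraction = FRACTION_POSITIVE - FRACTION_BACKGROUND


def hours_to_collect(target_cells,
                     rate=SORT_RATE_CELLS_PER_HOUR,
                     fraction=net_fraction):
    """Sorting time needed to collect target_cells GFP-positive cells."""
    return target_cells / (rate * fraction)


for target in (1e5, 1e6):
    print(f"{target:.0e} cells: {hours_to_collect(target):.2f} h")
```

Under these assumptions, 10⁵ cells take roughly a quarter of an hour and 10⁶ cells roughly three hours, in line with the "few hours" stated above.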

[0140] The approach to obtaining and studying transgene expression in erythroid cells described above is generally applicable to the study of any developmentally regulated process. This approach can also be applied to the identification of cis-acting promoter elements that are required for tissue specific gene expression (see Example 2). The analysis of promoter activity in a whole animal is desirable since dynamic temporal and spatial changes in a cellular microenvironment can be only poorly mimicked in vitro. The ease of generating and maintaining a large number of transgenic zebrafish lines makes obtaining statistically significant results practical. Finally, transgenic zebrafish that express GFP in specific tissues provide useful markers for identifying mutations that affect these lines in genetic screens. Given the genetic resources and embryological methods available for zebrafish, transgenic zebrafish exhibiting tissue-specific GFP expression are a very valuable tool for dissecting developmental processes.

Example 2

[0141] Identification of Enhancers in GATA-2 Expression Sequences.

[0142] A large number of studies have shown that neuronal cell determination in invertebrates occurs in progressive waves that are regulated by sequential cascades of transcription factors. Much less is known about this process in vertebrates. It was realized that an integrated approach combining embryological, genetic and molecular methods, such as that used to study neurogenesis in Drosophila (Ghysen et al., Genes & Dev 7:723-33 (1993)), would facilitate the identification of the molecular mechanisms involved in specifying neuronal fates in vertebrates. The following is an example of identification of cis-acting sequences that control neuron-specific gene expression in a vertebrate. Such identification is an initial step toward unraveling similar cascades in a vertebrate.

[0143] Transcription factors bind to cis-acting DNA sequences (sometimes referred to as response sequences) to regulate transcription. Often these transcription factors are members of multigene families that have overlapping, but distinct, expression patterns and functions. The transcription factor GATA-2 is a member of such a gene family (Yamamoto et al., Genes Dev 4:1650-62 (1990)). Each member of the GATA gene family is characterized by its ability to bind to cis-acting DNA elements with the consensus core sequence WGATAR (Orkin, Blood 80:575-81 (1992); SEQ ID NO:18). All protein products of the GATA family contain two copies of a highly conserved structural motif, commonly known as a zinc finger, which is required for DNA binding (Martin and Orkin, Genes Dev 4:1886-98 (1990)). Six members of the GATA family have been identified in vertebrates (Orkin, Blood 80:575-81 (1992), Orkin, Curr Opin Cell Biol 7:870-7 (1995)). Pannier, another member of the GATA gene family, is expressed in Drosophila neuronal precursors and inhibits expression of achaete-scute, a gene complex that plays a critical role in neurogenesis in Drosophila (Ramain et al., Development 119:1277-91 (1993)).

[0144] In chicken and mouse, the transcription factor GATA-2 is expressed in hematopoietic precursors, immature erythroid cells, proliferating mast cells, the central nervous system (CNS), and sympathetic neurons (Yamamoto et al., Genes & Dev 4:1650-62 (1990), Orkin, Blood 80:575-81 (1992), Jippo et al., Blood 87:993-8 (1996)). Studies in zebrafish (Detrich et al., Proc Natl Acad Sci USA 92:10713-7 (1995)) and Xenopus (Zon et al., Proc Natl Acad Sci USA 88:10642-6 (1991), Kelley et al., Dev Biol 165:193-205 (1994)) have also shown that GATA-2 expression is restricted to hematopoietic tissues and the CNS. Homozygous null mutants, created in mouse via homologous recombination, have profound deficits in all hematopoietic lineages (Tsai et al., Nature 371:221-6 (1994)). The role played by GATA-2 in neuronal tissue of these mice has not been carefully examined, perhaps because the embryos die before day E11.5. Analysis of GATA-2 expression in chick embryonic neuronal tissue after notochord ablation has suggested that GATA-2 plays a role in specifying a neurotransmitter phenotype (Groves et al., Development 121:887-901 (1995)). In addition, GATA factors are required for activity of the neuron-specific enhancer of the gonadotropin-releasing hormone gene (Lawson et al., Mol Cell Biol 16:3596-605 (1996)).

[0145] The effects of various hematopoietic growth factors on GATA-2 expression have been carefully studied in tissue culture systems (Weiss et al., Exp Hematol 23:99-107 (1995)) and some growth factors have been shown to have dramatic effects on early embryonic GATA-2 expression (Walmsley et al., Development 120:2519-29 (1994), Maeno et al., Blood 88:1965-72 (1996)). In addition, nuclear translocation of a maternally supplied CCAAT binding transcription factor has been shown to be necessary for the onset of GATA-2 transcription at the mid-blastula transition in Xenopus (Brewer et al., Embo J 14:757-66 (1995)). However, prior to the disclosed work, nothing was known about the mechanisms that control neuron-specific expression of this gene.

[0146] Cloning and sequencing of 5′ part of GATA-2 genomic DNA

[0147] A zebrafish genomic phage library was screened with the conserved zinc finger domain of zebrafish GATA-2 cDNA radiolabeled with ³²P. Two positive clones, λGATA-21 and λGATA-22, were identified. Restriction fragments of λGATA-21 were subcloned into pBluescript II KS(−). DNA sequence of the resulting clones was obtained from −4807 to +2605 relative to the GATA-2 translation start. Nucleotide sequence of the GATA-2 promoter region is shown in SEQ ID NO:27. Unless otherwise indicated, positions within the GATA-2 clones use this numbering. The 7.3 kb region upstream of the translation start in λGATA-21 was amplified by the polymerase chain reaction (PCR) using Expand™ Long Template PCR System (Boehringer Mannheim) for 25 cycles (94° C., 30 seconds; 68° C., 8 minutes). Primers used were a T7 primer and a primer specific for sequences 5′ to the GATA-2 translation start site (5′-ATGGATCCTCAAGTGTCCGCGCTTAGAA-3′; SEQ ID NO:19). The GATA-2 specific primer contained a BamHI site to facilitate subsequent cloning. The PCR product (P1) was cloned into the SmaI/BamHI sites of pBluescript II KS(−).
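The presence of the BamHI site in the GATA-2 specific primer can be confirmed by a simple string search. The following sketch is illustrative only; it uses the standard recognition sequences for BamHI (GGATCC) and SmaI (CCCGGG), and the helper name `find_sites` is an assumption. Both recognition sequences are palindromic, so searching one strand is sufficient.

```python
# GATA-2 specific primer from the text (SEQ ID NO:19), written 5' to 3'.
GATA2_PRIMER = "ATGGATCCTCAAGTGTCCGCGCTTAGAA"

# Standard recognition sequences of the two enzymes used for cloning P1.
SITES = {"BamHI": "GGATCC", "SmaI": "CCCGGG"}


def find_sites(seq, sites):
    """Map each enzyme name to the 1-based start positions of its site in seq."""
    hits = {}
    for name, site in sites.items():
        hits[name] = [i + 1
                      for i in range(len(seq) - len(site) + 1)
                      if seq[i:i + len(site)] == site]
    return hits


primer_sites = find_sites(GATA2_PRIMER, SITES)
print(len(GATA2_PRIMER), primer_sites)
```

The search places the BamHI site at positions 3 to 8 of the 28-mer, leaving two bases 5′ of the site, and finds no SmaI site within the primer.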

[0148] Plasmid constructs

[0149] The 7.3 kb DNA fragment containing the putative GATA-2 expression sequences (P1) was ligated to a modified GFP reporter gene (GM2, described above), resulting in construct P1-GM2 (FIG. 3). Based on P1-GM2, constructs containing successive 5′ deletions in the region upstream of the transcription start site were generated using the restriction sites PstI, SacI, AatII, ClaI and ScaI in this upstream region (FIG. 3). Constructs nsP5-GM2 and nsP6-GM2 were generated by ligating the 1116 bp fragment containing the GATA-2 neuron-specific enhancer from −4807 to −3692 to P5-GM2 and P6-GM2, respectively (FIG. 4). The same fragment containing the neuron-specific enhancer was also ligated to a 243 bp SphI/BamHI fragment of the Xenopus elongation factor 1α (EF 1α) minimal promoter that had previously been ligated to the GM2 gene, resulting in construct ns-XS-GM2 (FIG. 4). The EF 1α minimal promoter has been described in Johnson and Krieg, Gene 147:223-6 (1994).

[0150] PCR mapping of neuron-specific enhancer

[0151] PCR technology was exploited to create a deletion series within the 1116 bp neuron-specific enhancer using nsP5-GM2 as a template. A total of 10 specific 22-mer primers were synthesized. These included ns4647, ns4493, ns4292, ns4092, ns3990, ns3872, ns3851, ns3831, ns3800 and ns3789, in which the numbers refer to the positions of their 5′ end base in the GATA-2 genomic sequence. A T7 primer was also used in the PCR reactions. The amplified fragments all contained the GM2 gene and SV40 polyadenylation signal in addition to the GATA-2 expression sequences. PCR reactions were performed using Expand™ Long Template PCR System (Boehringer Mannheim) for 25 cycles (94° C., 30 seconds; 55° C., 30 seconds; 72° C., 2 minutes). The PCR products were purified with GENECLEAN II Kit (Bio 101 Inc.) and subsequently used for microinjection.

[0152] After a 31 bp neural-specific enhancer was identified, five additional primers, each containing 2 or 3 mutant bases relative to the wild type enhancer sequence, were designed. These primers are (the mutant bases are underlined):

[0153] ns3831 5′-TCTGCGCCGCTTTCTGCCCCCTCCTGCCCTCTT-3′ (SEQ ID NO:20)

[0154] ns3831M1 5′-TCTGCGAAGCTTTCTGCCCCCTCCTGCCCTCTT-3′ (SEQ ID NO:21)

[0155] ns3831M2 5′-TCTGCGCCGCTTTCTGAACCCTCCTGCCCTCTT-3′ (SEQ ID NO:22)

[0156] ns3831M3 5′-TCTGCGCCGCTTTCTGCCAACTCCTGCCCTCTT-3′ (SEQ ID NO:23)

[0157] ns3831M4 5′-TCTGCGCCGCTTTCTGCCCCAAACTGCCCTCTT-3′ (SEQ ID NO:24)

[0158] ns3831M5 5′-TCTGCGCCGCTTTCTGCCCCCTCCTGCCCTCTT-3′ (SEQ ID NO:25)

[0159] These primers were used in conjunction with the T7 primer for PCR amplification of the target sequence using nsP5-GM2 as the template. PCR conditions were identical to those described above.
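Because the underlining of the mutant bases is not preserved in this text, the altered positions can be recovered by comparing each mutant primer with the wild type ns3831 primer. The sketch below does this position by position; the sequences are copied from the primer list above, and the helper name `mismatches` is an assumption. As listed here, ns3831M5 is identical to ns3831, so its intended mutation cannot be recovered from this text.

```python
# Wild type primer ns3831 (SEQ ID NO:20) and mutant primers (SEQ ID NO:21-25),
# copied from the list above, written 5' to 3'.
WILD_TYPE = "TCTGCGCCGCTTTCTGCCCCCTCCTGCCCTCTT"
MUTANTS = {
    "ns3831M1": "TCTGCGAAGCTTTCTGCCCCCTCCTGCCCTCTT",
    "ns3831M2": "TCTGCGCCGCTTTCTGAACCCTCCTGCCCTCTT",
    "ns3831M3": "TCTGCGCCGCTTTCTGCCAACTCCTGCCCTCTT",
    "ns3831M4": "TCTGCGCCGCTTTCTGCCCCAAACTGCCCTCTT",
    "ns3831M5": "TCTGCGCCGCTTTCTGCCCCCTCCTGCCCTCTT",
}


def mismatches(reference, sequence):
    """List (1-based position, reference base, altered base) for each difference."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must have the same length")
    return [(i + 1, r, s)
            for i, (r, s) in enumerate(zip(reference, sequence))
            if r != s]


mutated_positions = {name: mismatches(WILD_TYPE, seq)
                     for name, seq in MUTANTS.items()}

for name, diffs in mutated_positions.items():
    print(name, diffs)
```

Positions are counted from the 5′ end of the 33-base primer, so position 1 corresponds to −3831 in the GATA-2 genomic numbering.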

[0160] Microinjection of zebrafish

[0161] Wild-type zebrafish were used for all microinjections. Plasmid DNA was linearized using single-cut restriction sites in the vector backbone, purified using GENECLEAN II Kit (Bio 101 Inc.), and resuspended in 5 mM Tris, 0.5 mM EDTA, 0.1 M KCl at a final concentration of 100 μg/ml. Single cell embryos were microinjected as described above. Each construct was injected independently 2 to 5 times and the data obtained were pooled.

[0162] Fluorescent microscopic observation

[0163] Embryos were anesthetized using tricaine as described above and examined under a FITC filter on a Zeiss microscope equipped with a video camera. Pictures showing GFP positive cells in living embryos were generated by superimposing a bright field image on a fluorescent image using Adobe Photoshop software.

[0164] Whole-mount RNA in situ hybridization

[0165] Sense and antisense digoxigenin-labeled RNA probes were generated from a GATA-2 cDNA subclone containing a 1 kb fragment of the 5′ coding sequence using DIG/Genius™ 4 RNA Labeling Kit (SP6/T7) (Boehringer Mannheim). RNA in situ hybridizations were performed as described by Westerfield (The Zebrafish Book (University of Oregon Press, 1995)).

[0166] Isolation of GATA-2 genomic DNA

[0167] Two GATA-2 positive phage clones, λGATA-21 and λGATA-22, were identified as described above. Preliminary restriction analysis suggested that λGATA-21 contained a large region upstream of the translation start codon. 7412 bp of this clone was sequenced from −4807 to +2605 relative to the translation start site. The putative GATA-2 expression sequences (P1) containing approximately 7.3 kb upstream of the translation start site from λGATA-21 were subcloned into a plasmid vector for expression studies.

[0168] Expression pattern of a modified GFP gene driven by the putative GATA-2 promoter in zebrafish embryos

[0169] The construct P1-GM2 was generated by ligation of a modified GFP reporter gene (GM2) to P1 (FIG. 3). This construct was injected into the cytoplasm of single cell zebrafish embryos and GFP expression in the microinjected embryos was examined at a number of distinct developmental stages by fluorescence microscopy.

[0170] GFP expression was initially observed by fluorescence microscopy at the 4000 cell stage at about 4 hours post-injection (pi). At the dorsal shield stage (6 hours pi), GFP expression was observed throughout the prospective ventral mesoderm and ectoderm but expression in the dorsal shield was extremely rare. At 16 hours pi, GFP expression was observed in the developing intermediate cell mass (ICM), the early hematopoietic tissue of zebrafish. In addition, GFP expression could be seen in superficial EVL cells at 4 hours pi. Expression in the EVL peaked between 24 and 48 hours pi and became extremely weak by day 7. GFP expression in neurons, including extended axons, was first observed at 30 hours pi and was maintained at high levels through at least day 8.

[0171] Embryos injected with the P1-GM2 construct expressed GFP in a manner restricted to hematopoietic cells, EVL cells, and the CNS. The GFP expression patterns in gastrulating embryos, in the blood progenitor cells, and in neurons were consistent with the RNA in situ hybridization patterns previously generated for GATA-2 mRNA expression in zebrafish (Detrich et al., Proc Natl Acad Sci USA 92:10713-7 (1995)). However, GATA-2 expression in EVL has not been detected by RNA in situ hybridizations.

[0172] More than 95% of the embryos injected with P1-GM2 had tissue specific GFP expression (Table 3). About 5% of these embryos had non-specific GFP expression, limited to fewer than five cells per embryo. These observations indicated that the DNA fragment extending approximately 7.3 kb upstream from the GATA-2 translation start site sufficed to correctly generate the embryonic tissue-specific pattern of GATA-2 gene expression.

TABLE 3

Construct   No. embryos   No. embryos       No. embryos with     No. embryos with   No. embryos with
            observed      with expression   circulating blood    neuronal           EVL expression (%)
                                            expression (%)       expression (%)
P1-GM2      141           135                3 (2.13)            106 (75.2)         130 (92.2)
P2-GM2      198           177               32 (15.7)            136 (68.7)         175 (88.4)
P3-GM2      303           291               29 (9.6)               0 (0)            277 (91.4)
P4-GM2      143           126               21 (14.7)              0 (0)            118 (82.5)
P5-GM2      139            90               16 (11.5)              0 (0)             20 (14.4)
P6-GM2      138            44                2 (1.4)               0 (0)             11 (8.0)

[0173] Gross mapping of tissue-specific enhancers

[0174] To identify the portions of the GATA-2 expression sequences that are responsible for regulating tissue specific gene expression, several constructs containing deletions in the promoter were generated (FIG. 3). Naturally occurring restriction sites were used to create a series of gross deletions in the expression sequence region. Each construct was individually microinjected into single cell embryos. The developing embryos were observed by fluorescence microscopy at regular intervals for several days.

[0175] Embryos injected with P2-GM2, which contains GATA-2 sequences from −4807 to +1, expressed GFP in a manner similar to embryos injected with the original construct, P1-GM2 (Table 3). At 48 hr pi, GFP expression was observed in circulating blood cells, the CNS and the EVL. However, careful observation of the injected embryos at 16 hr pi revealed that expression in the posterior end of the ICM was nearly abolished. This suggested that an enhancer for GATA-2 expression in early hematopoietic progenitor cells may reside in the deleted region. Expression of GFP in circulating blood cells increased from approximately 2% to 16%, suggesting that a potential repressor for expression of GATA-2 in erythrocytes may also reside in the deleted region.

[0176] Embryos injected with P3-GM2, which contains GATA-2 sequences from −3691 to +1, expressed GFP in circulating blood cells and in the EVL, but did not express in the CNS. Embryos injected with other constructs that lack the deleted 1116 bp region, extending from −4807 to −3692, also had no GFP expression in the CNS (Table 3). It was concluded that the 1116 bp region, extending from −4807 to −3692, contained a neuron-specific enhancer element.

[0177] Embryos injected with P4-GM2, which contains GATA-2 sequences from −2468 to +1, had a GFP expression pattern similar to those injected with P3-GM2. Injection with P5-GM2, which contains GATA-2 sequences from −1031 to +1, resulted in a sharp drop with respect to percentage of embryos expressing GFP in the EVL, but GFP expression in circulating blood cells was unaffected. This indicates that the 1437 bp region, extending from −2468 to −1032, contains an EVL-specific enhancer. The 1031 bp segment present in P5-GM2 may represent the minimal expression sequences necessary for the maintenance of tissue specific expression of GATA-2.
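The reasoning used in this gross mapping, in which an enhancer is assigned to the region removed between two successive constructs when expression in a tissue falls sharply, can be sketched as follows. The percentages and 5′ endpoints are taken from Table 3 and the text for P2-GM2 through P5-GM2; the three-fold threshold `FOLD_DROP` and the function name `map_enhancers` are assumptions chosen for illustration, not criteria stated in the text.

```python
# (construct, 5' endpoint, % circulating blood, % neuronal, % EVL), from Table 3.
DELETION_SERIES = [
    ("P2-GM2", -4807, 15.7, 68.7, 88.4),
    ("P3-GM2", -3691, 9.6, 0.0, 91.4),
    ("P4-GM2", -2468, 14.7, 0.0, 82.5),
    ("P5-GM2", -1031, 11.5, 0.0, 14.4),
]
TISSUES = ("blood", "neuronal", "EVL")
FOLD_DROP = 3.0  # hypothetical threshold for calling a "sharp drop"


def map_enhancers(series, fold_drop=FOLD_DROP):
    """Assign a tissue-specific enhancer to each deleted interval where the
    percentage of expressing embryos falls by more than fold_drop."""
    found = {tissue: [] for tissue in TISSUES}
    for longer, shorter in zip(series, series[1:]):
        for k, tissue in enumerate(TISSUES):
            before, after = longer[2 + k], shorter[2 + k]
            if before > 0 and after * fold_drop < before:
                # Deleted interval: from the 5' end of the longer construct to
                # the base just upstream of the shorter construct's 5' end.
                found[tissue].append((longer[1], shorter[1] - 1))
    return found


enhancer_intervals = map_enhancers(DELETION_SERIES)
print(enhancer_intervals)
```

With these inputs the sketch assigns the neuronal enhancer to −4807 to −3692 and the EVL enhancer to −2468 to −1032, and finds no interval whose removal sharply reduces expression in circulating blood cells, matching the conclusions drawn above.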

[0178] Neuron-specific enhancer activity

[0179] To confirm the neuron-specific enhancer activity of the 1116 bp region that spans from −4807 to −3692 of GATA-2, nsP5-GM2 was constructed by ligating the 1116 bp fragment to P5-GM2, which contains the 1031 bp region upstream of the translation start of the GATA-2 gene operably linked to a sequence encoding GM2 (FIG. 4). Approximately 70% of the embryos injected with nsP5-GM2 had GFP expression in the CNS (FIG. 5), while no embryos injected with P5-GM2 had GFP expression in the CNS as noted in Table 3. This indicates that the 1116 bp region can effectively direct neuron-specific expression.

[0180] To determine whether the 1116 bp neuron-specific enhancer activity was context dependent, the construct ns-Xs-GM2 (FIG. 4) was generated by ligating the enhancer to the Xenopus elongation factor 1α minimal promoter (Johnson and Krieg, Gene 147:223-6 (1994)) operably linked to the sequence encoding GM2 (Xs-GM2; FIG. 4). When injected with Xs-GM2, embryos expressed GFP in various tissues including muscle, notochord, blood cells and melanocytes. However, no GFP expression was observed in the CNS (FIG. 5). Injection with ns-XS-GM2 resulted in 8.5% of the embryos having GFP expression in the CNS, far less than obtained by injection with nsP5-GM2 (FIG. 5). Another construct, nsP6-GM2 (FIG. 4), had an additional 653 bp deletion in the GATA-2 minimal expression sequence, extending from −1031 to −378. Injection of nsP6-GM2 resulted in 6.2% of embryos expressing GFP in the CNS (FIG. 5). Injection with P6-GM2 resulted in no GFP expression in the CNS (Table 3). These results suggest that the 1116 bp enhancer has some ability to confer neuronal specificity on a heterologous promoter, but requires proximal elements within its own promoter to exert its full activity.

[0181] Fine mapping of a neuron-specific cis-acting regulatory element

[0182] To precisely map the putative neuron-specific enhancer, a series of constructs containing progressive deletions in the 1116 bp DNA fragment was generated by PCR, using nsP5-GM2 as the template. The PCR products obtained were used directly for microinjection. The first deletion series included ns4647, ns4493, ns4292, ns4092 and ns3990 (where the number indicates the upstream endpoint of the deleted fragment). Microinjection of all 5 mutants gave a similar percentage of embryos having GFP expression in the CNS (FIG. 6). This indicated that a neuron-specific enhancer resides within the 298 bp sequence (from −3990 to −3692) contained in ns3990.

[0183] Next, two additional deletion constructs, ns3872 and ns3789, were generated. As shown in FIG. 6, over 60% of embryos injected with ns3872 had GFP expression in the CNS, while embryos injected with ns3789 lacked GFP expression in the CNS. This indicated that the neuron-specific enhancer element was located within an 83 bp sequence from −3872 to −3790.

[0184] Injection of embryos with three additional deletion constructs ns3851, ns3831 and ns3800 allowed localization of the neuron-specific enhancer element to a 31 bp pyrimidine-rich sequence. This element has the sequence 5′-TCTGCGCCGCTTTCTGCCCCCTCCTGCCCTC-3′ (nucleotides 1 to 31 of SEQ ID NO:20), which extends from −3831 to −3801 within the GATA-2 genomic DNA.
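Several of the fragment sizes quoted in this example follow directly from the coordinates of their endpoints when both endpoints are counted. The sketch below assumes the usual convention that there is no position 0, with +1 being the first base at the translation start; that convention is inferred from the numbers in the text rather than stated in it, and the function name `span_length` is hypothetical.

```python
def span_length(start, end):
    """Inclusive length of the interval start..end in a coordinate system that
    has no position 0 (..., -2, -1, +1, +2, ...)."""
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not be downstream of end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # the interval crosses the -1/+1 boundary
    return length


# Intervals quoted in the text, with the sizes given there.
QUOTED = {
    "sequenced region": ((-4807, 2605), 7412),
    "neuron-specific fragment": ((-4807, -3692), 1116),
    "EVL-specific region": ((-2468, -1032), 1437),
    "ns3872 to ns3789 interval": ((-3872, -3790), 83),
    "31 bp element": ((-3831, -3801), 31),
}

for label, ((start, end), quoted_bp) in QUOTED.items():
    print(f"{label}: computed {span_length(start, end)} bp, quoted {quoted_bp} bp")
```

Under this convention the computed lengths agree with the quoted sizes for each of the intervals listed.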

[0185] Site directed mutagenesis within neuron-specific enhancer element

[0186] To determine the core sequence necessary for the activity of the neuron-specific element, five primers, each having two to three altered nucleotides within the 31 bp neuron-specific element (see above), were used to amplify nsP5-GM2. The PCR products obtained were directly injected into single cell embryos. This 31 bp sequence contains an Ets-like recognition site (AGGAC) in an inverted orientation which is present in several neuron-specific promoters (Chang and Thompson, J. Biol Chem 271:6467-75 (1996), Charron et al., J. Biol Chem 270:30604-10 (1995)). Therefore, four of the primers used in these PCR reactions contain altered nucleotides within the Ets-like recognition site or in the adjacent sequence. As expected, embryos injected with ns3831M1, which contains two mutant nucleotides that are thirteen nucleotides upstream of the Ets-like recognition site, showed little change in neuron-specific GFP expression (FIG. 7). A mutation of 2 nucleotides (ns3831M2) that lie three nucleotides upstream of the Ets-like recognition site had no effect on enhancer activity (FIG. 7). Mutation of two nucleotides just one nucleotide upstream of the Ets-like motif, contained in ns3831M3, completely eliminated the neuron-specific enhancer activity of the 31 bp element (FIG. 7). Mutation of three nucleotides (ns3831M4), of which two lie within the Ets-like recognition site, also resulted in a sharp decrease in enhancer activity (FIG. 7). A mutation of two nucleotides that lie within the Ets-like recognition site (ns3831M5) reduced the neuron-specific enhancer activity of the 31 bp element by approximately 50% (FIG. 7). From this it was concluded that a CCCTCCT motif, which partially overlaps the Ets-like recognition site within the 31 bp sequence, is absolutely required for neuron-specific enhancer activity.
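The positional relationships described in this paragraph can be checked against the 31 bp element. The sketch below assumes that "inverted orientation" means the AGGAC core lies on the opposite strand, so that its complement TCCTG is read on the strand shown; this reading is an interpretation, chosen because it reproduces the distances stated above. The mutated positions are those obtained by comparing the listed mutant primers with ns3831 (ns3831M5 is omitted because its listed sequence does not differ from wild type), and all helper names are hypothetical.

```python
ELEMENT = "TCTGCGCCGCTTTCTGCCCCCTCCTGCCCTC"  # 31 bp element, -3831 to -3801
ETS_CORE = "AGGAC"
REQUIRED_MOTIF = "CCCTCCT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# 1-based positions altered in each mutant primer relative to ns3831.
MUTATED = {
    "ns3831M1": [7, 8],
    "ns3831M2": [17, 18],
    "ns3831M3": [19, 20],
    "ns3831M4": [21, 22, 23],
}


def find_span(seq, motif):
    """Return the 1-based (start, end) of the single occurrence of motif in seq."""
    starts = [i + 1 for i in range(len(seq) - len(motif) + 1)
              if seq[i:i + len(motif)] == motif]
    if len(starts) != 1:
        raise ValueError(f"expected one occurrence of {motif}, found {len(starts)}")
    return starts[0], starts[0] + len(motif) - 1


# Assumption: the Ets-like core appears as its complement on the strand shown.
ets_span = find_span(ELEMENT, ETS_CORE.translate(COMPLEMENT))
motif_span = find_span(ELEMENT, REQUIRED_MOTIF)


def hits(positions, span):
    """Positions that fall inside the inclusive span."""
    return [p for p in positions if span[0] <= p <= span[1]]


summary = {name: {"gap_to_ets": ets_span[0] - max(pos) - 1,
                  "in_ets": hits(pos, ets_span),
                  "in_motif": hits(pos, motif_span)}
           for name, pos in MUTATED.items()}

print("Ets-like site:", ets_span, "CCCTCCT motif:", motif_span)
for name, info in summary.items():
    print(name, info)
```

On this reading the Ets-like site occupies positions 22 to 26 and the CCCTCCT motif positions 19 to 25, so the two overlap by four bases. The ns3831M3 mutation falls inside the CCCTCCT motif but outside the Ets-like site, which is consistent with the conclusion that the motif, rather than the Ets-like site alone, is required.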

[0187] This dissection of expression sequences using transgenic fish, exemplified in zebrafish and with GATA-2 as described above, provides a system that allows the rapid and efficient identification of those cis-acting elements that play key roles in modulating the expression of developmentally regulated genes. Identification of these cis-acting elements is a useful step toward determining the genes that operate earlier than the gene under study in the specification of a developmental pathway (since the identified distal regulatory elements interact with transcription factors which must be expressed for the regulatory elements to function).

[0188] Careful analysis of GATA-2 promoter activity in zebrafish embryos revealed three distinct tissue specific enhancer elements. These three elements appear to act independently to enhance gene expression specifically in blood precursors, the EVL, or the CNS. Deletion of one or two of the elements will generate transgene constructs that can drive expression of a gene of interest in a specific tissue. Such constructs also allow study of the tissue-specific function of genes expressed in multiple tissues.

[0189] It has been shown that the developmental regulation of the mammalian HOX6 and GAP-43 promoter activities is conserved in zebrafish (Westerfield et al., Genes Dev 6:591-8 (1992), Reinhard et al., Development 120:1767-75 (1994)). If the same neuron-specific element identified in the zebrafish GATA-2 promoter is also shown to be required for neuron-specific activity of the mouse promoter, one could specifically knock out expression of GATA-2 in the mouse CNS by targeting this cis-element. This would allow one to determine precisely the role that GATA-2 plays in the CNS.

[0190] The neuron-specific enhancer element of GATA-2 has been precisely mapped and found to contain the core DNA consensus sequence for binding by Ets-related transcription factors. Although Ets-related factors have been implicated in the regulation of expression of a number of neuron-specific genes (Chang and Thompson, J. Biol Chem 271:6467-75 (1996), Charron et al., J. Biol Chem 270:30604-10 (1995)), another sequence, CCTCCT, present in this region of the zebrafish GATA-2 promoter was found to be required for expression in the CNS. This motif partially overlaps an inverted form of the core sequence of the Ets DNA binding recognition site. As has been shown for other genes, the activities of Ets family proteins often rely more on their ability to interact with other transcription factors than on specific binding to a cognate DNA sequence (Crepieux et al., Crit Rev Oncog 5:615-38 (1994)). It is possible that an independent factor that binds to the CCTCCT motif is required for neuron-specific activity of the GATA-2 promoter.

[0191] A number of growth factors are known to affect early embryonic expression of GATA-2. Noggin and activin, which both have dorsalizing activity in Xenopus embryos, downregulate GATA-2 expression in dorsal mesoderm (Walmsley et al., Development 120:2519-29 (1994)). BMP-4 activates GATA-2 expression in ventral mesoderm and is probably important to early blood progenitor proliferation (Maeno et al., Blood 88:1965-72 (1996)). Growth factors that might affect expression of GATA-2 in neurons are not known. However, both BMP-2 and BMP-6 can activate neuron-specific gene expression (Fann and Patterson, J. Neurochem 63:2074-9 (1994)). Consistent with studies on growth factors that upregulate or downregulate GATA-2 expression, GATA-2 promoter activity was excluded from the zebrafish dorsal shield. It has also been discovered that lithium chloride treatment dorsalizes the injected embryos and dramatically reduces GATA-2 promoter activity as determined by GFP expression.

[0192] Although GATA-2 expression has not been observed in the EVL by in situ hybridization on whole embryos, this may be due to the conditions used. In mouse, embryonic mast cells present in the skin have only been detected by in situ hybridization performed on skin tissue sections (Jippo et al., Blood 87:993-8 (1996)). Interestingly, expression of GATA-2 in mouse skin mast cells occurs only during a short period of embryogenesis, similar to what has been found for EVL cells in zebrafish. It is possible that the constructs used in this example may be missing elements that would specifically silence GATA-2 expression in the zebrafish EVL.

[0193] The method described above is generally applicable to the dissection of any developmentally regulated vertebrate promoter. Tissue specific and growth factor response elements can be rapidly identified in this manner. The fact that zebrafish typically produce hundreds of fertilized eggs per mating facilitates obtaining statistically significant results. While tissue culture systems have been useful for identifying many important transcription factors, transfection analysis in tissue culture cells cannot simulate the complex, rapidly changing microenvironment to which the promoter must respond during embryogenesis. Temporal and spatial analysis of promoter activity can be only poorly mimicked in vitro. The system described herein allows complete analysis of promoter activity in all tissues of a whole vertebrate.

1 27 1 26 DNA Artificial Sequence Description of Artificial Sequence;Note = Synthetic Construct 1 ccggatcctg caagtgtagt attgaa 26 2 21 DNAArtificial Sequence Description of Artificial Sequence; Note = SyntheticConstruct 2 aatgtatcaa tcatggcaga c 21 3 24 DNA Artificial SequenceDescription of Artificial Sequence; Note = Synthetic Construct 3tgtatagttc atccatgcca tgtg 24 4 21 DNA Artificial Sequence Descriptionof Artificial Sequence; Note = Synthetic Construct 4 atgaacctttctactcaagc t 21 5 21 DNA Artificial Sequence Description of ArtificialSequence; Note = Synthetic Construct 5 gctgcttcca cttccactca t 21 6 23DNA Artificial Sequence Description of Artificial Sequence; Note =Synthetic Construct 6 agacacagtc caggtgagtc caa 23 7 23 DNA ArtificialSequence Description of Artificial Sequence; Note = Synthetic Construct7 ctttcgccac ctggtatgtt gtg 23 8 22 DNA Artificial Sequence Descriptionof Artificial Sequence; Note = Synthetic Construct 8 aaaaagaggctggtatgtaa aa 22 9 22 DNA Artificial Sequence Description of ArtificialSequence; Note = Synthetic Construct 9 aaactgcaca atgtgagtat ac 22 10 21DNA Artificial Sequence Description of Artificial Sequence; Note =Synthetic Construct 10 attaaaacag ttcgccaagt c 21 11 21 DNA ArtificialSequence Description of Artificial Sequence; Note = Synthetic Construct11 aattttacag aggctcgtga a 21 12 22 DNA Artificial Sequence Descriptionof Artificial Sequence; Note = Synthetic Construct 12 cctgcatcagattgtcagca aa 22 13 22 DNA Artificial Sequence Description of ArtificialSequence; Note = Synthetic Construct 13 ctttttgcag gtcaacaggc ct 22 14 8PRT Artificial Sequence Description of Artificial Sequence; Note =Synthetic Construct 14 Arg His Ser Pro Val Arg Gln Val 1 5 15 8 PRTArtificial Sequence Description of Artificial Sequence; Note = SyntheticConstruct 15 Leu Ser Pro Pro Glu Ala Arg Glu 1 5 16 8 PRT ArtificialSequence Description of Artificial Sequence; Note = Synthetic Construct16 Lys Lys Arg Leu Ile Val Ser Lys 1 5 17 8 PRT Artificial 
SequenceDescription of Artificial Sequence; Note = Synthetic Construct 17 LysLeu His Asn Val Asn Arg Pro 1 5 18 6 PRT Artificial Sequence Descriptionof Artificial Sequence; Note = Synthetic Construct 18 Trp Gly Ala ThrAla Arg 1 5 19 28 DNA Artificial Sequence Description of ArtificialSequence; Note = Synthetic Construct 19 atggatcctc aagtgtccgc gcttagaa28 20 33 DNA Artificial Sequence Description of Artificial Sequence;Note = Synthetic Construct 20 tctgcgccgc tttctgcccc ctcctgccct ctt 33 2133 DNA Artificial Sequence Description of Artificial Sequence; Note =Synthetic Construct 21 tctgcgaagc tttctgcccc ctcctgccct ctt 33 22 33 DNAArtificial Sequence Description of Artificial Sequence; Note = SyntheticConstruct 22 tctgcgccgc tttctgaacc ctcctgccct ctt 33 23 33 DNAArtificial Sequence Description of Artificial Sequence; Note = SyntheticConstruct 23 tctgcgccgc tttctgccaa ctcctgccct ctt 33 24 33 DNAArtificial Sequence Description of Artificial Sequence; Note = SyntheticConstruct 24 tctgcgccgc tttctgcccc aaactgccct ctt 33 25 33 DNAArtificial Sequence Description of Artificial Sequence; Note = SyntheticConstruct 25 tctgcgccgc tttctgcccc ctcctgccct ctt 33 26 5563 DNAArtificial Sequence Description of Artificial Sequence; Note = SyntheticConstruct 26 gaattctagt tctagggtaa actatacagt ttttttaatt aataaagttggtggaggtaa 60 atgtctttaa tgagtaagtc actgaatcat ttattcattt gatttgttcaaacagttgat 120 tcatttagaa attcattaga aatcaarctg cagtctttat gaacgacccgttaaaccttt 180 agtttatgtg attggaatca aaaccccact gtgtgttaat cagatgaatgctgaaaagca 240 cagacaggtt ttaatccatc atgccattcc ttctagaaag gaaacattagtaatggtttt 300 aattttcagc attttaataa ccacaagcac atttctaatg caatgaaatcatatttgcaa 360 accaaaacag ctgattcttg aaatggccta cacagagtcc agacctgaatattatagaga 420 tggtgcagta tcacttgaaa gaaaaataaa cattaatctt aaatctaaagaacttaaatc 480 taaagaagca ctatgagaaa tgctgaaaaa gcctgatttt acatagcacattatttaaaa 540 tgaaacctca gggacagtat acagaacagt tcaaatacag tatacagtaaacagaacagg 600 tcaggtcaca ccaaatactg gcaagccatt ttattctgaa 
aatgtttcatttagattaga 660 acagaagaac tanagagacc nnnaaagttg gctgaatata aataaatataccactgcttt 720 gacggytcta gacttttgca cagtacttaa atgcagtact taaagtaattcntcatttag 780 atgagctaag taaactatga gttgtgaaaa aacacaccat tgtgtgatgagcagtgaggg 840 tgtcactgta gctgtgaatt tgttcatgta gtgccattac tagttatacgatccccaacc 900 tcccactcca atntagatag cttcttatca cagttcagca gcagcgcacacacacagaaa 960 cacacacaca gccacatccn tcaaaantgg tctttggaga cttctttctctttgaccgtt 1020 tagttttcgt gagcataatt aagttactct atacaataaa atgtgagtaaatggacacca 1080 tagatgtcta aataaataaa cacataaata aaaagatgac actttcacataacaccatca 1140 aacagcttca taaaattata ttatatagaa tattctataa ttatgttgatttgtaacgca 1200 ctgtaaaaaa aggattactg ccttaaattg ataatttgtt gaagaaaatttactttcctg 1260 aacatttatt gtattaatat attacagtac gctcaataat acatgtgaaactgcagcttc 1320 atatttttaa atgttttaat gtatttaata tatatatata taatatttatatatatatgt 1380 atgcatgtat gcatatttat tctgttgaaa ggagattagt tttattcaacacattagttt 1440 taataactcg tttctaataa ctgatttctt ttatctttgt catgatgacagtaaataata 1500 tttgactaga tatttttcaa gacatttcta taccacttaa agtgacatttaaaggcttaa 1560 ctaggttaat taggttaagt aagcaggtta gggtaattgg gtaagttattgtacaacaat 1620 ggtttgttct gtagactatt gaaaaaaatg gcttaaaggg gctaataattttgtccctta 1680 aaatggtgtt taaaaatgta aactgctttt attgtggctg aaaaaacaaataagaatttc 1740 tccagaaaaa aaaatattat cagacactgt gaaaatgtcc ttactctgttaaacataatt 1800 tgtgaaatat gtaaaaaaga ataaaaaatt cacatggggg gtgataacttcaactacaca 1860 cacacacaca cacacacaca cacatttcag tgaccaaaat atgttgtrggtttntktntt 1920 cattgatata aaatgtgcga tgccatttcm aaaatccata tatagtttatgcaacattat 1980 attggamcca aaataagtaa tatacaaaat aagtagtatt atcttatccagtatatttga 2040 gtatttatat atcgaagttt agattcytaa tttaacaata tttatgaattatatgtttaa 2100 gttctaaaac aacacctcat gtaaatcaat aacatggtgc ttggtacagtatgctcaata 2160 atacatgaaa aactgcagct tcatatttaa aaatgttatt gtatgcaattacatgtacaa 2220 ttacaaataa cgtatggtaa tgtatacaaa tatatattta gtaatagagggtataatata 2280 tgtgatgcac atgcgaaaaa atatatcaca cacacacgca cgcacgcacacacacacaca 2340 cacacacatt 
tatttatgca tatgtacact ataaaaccca aaaagttaaactcaaaccat 2400 ttaaggaaac tgattgcaac aaaccattaa agttgaaaaa cgaatcctaatgagtactgt 2460 aaactgaatn tatttgagta aacgaagcaa tttgaggaca gtaaaacccaataaatgaag 2520 agaactcaaa ccaactgagc actgtaaaac ctaacaagtt aaggcaactcaaaccgtttg 2580 aggaaatcga tataagagtc ctgtgaactg tatttaatta actcattacttcaaaactct 2640 tttcaaatta gtagaattaa cattcagtac attttgagtt actacactcatttcatttga 2700 taaagttgac tgttgggttt tacagtgtat ctttttatta atttatataagaacatgtgt 2760 ggataatata agtacattta ttaacatcat tatatatgtg gcttcagctttatgcaaatg 2820 ctgaaagtta acgaattgaa atcaattaag catttcagta acataacacgtattgtaggt 2880 tttgtcttca ttgatataca catgcaatgc atttcaagtc atttataattgatgcattat 2940 attgtattgt accaatgtaa gtaatatata atatactata ttatattatccagtatattt 3000 gactttaaaa tattaaagtt tagattccta atgtaacaat acatatataatatgttaagg 3060 ttctagaatg gaaccttatg taaatcaawa acctggcgct tggtgaaggatttgcttctc 3120 tgratctcat cccagtttcc ctgaaaatta taaatgcaca atggtggarggaagttgaaa 3180 gtgttttgcc tgtcaaatga rartgacagt cttagtcctg tgctccggcagsccgttctg 3240 cgtccgtatc tctcaccatg attgcagcat tkgagtttat ttgcattactgttctttgct 3300 gagctgcacc aggggaaaag tgcttttgca ttttcattcg ctttgttcacagtcaccgtt 3360 tccatcccaa gtgctctttg ttaacacttt gcacgccatt ttaattgccaaatgtattag 3420 gccacagcat atgcttaatt cttttcaaca atgaaacttt attaatgatgtgcttgaatc 3480 atagatacta taagtttatg gttgttgtaa aattargttt ctctggctgtctgtgggatt 3540 ttcccagcgc tgttggattt gcgtctttat ctatatttat aagtgaagccattttatata 3600 atctctgaca gtattttatt tagattagaa attaaatact agtgttttttgtcttgtttc 3660 tatagtatta ttactatttt tttgcattaa tttacagaag atgcctgataaactgaattt 3720 agtataataa tttaaatacc aaaacatcat taggtacatt taaaataccaatcatgcaaa 3780 aaaataaccc tttgactgca catttaccca atgggtgtcc atttttgactttttaaataa 3840 tggtttacac acacatcatt gctggtttac aaaaaaatca aacataattcttttgcacga 3900 ctactctgaa ttttggtttc attcattttc tttttggcta agtctgtttattaatatgga 3960 gtcgccacag cggaatgaat cgccaactta tttagcatat gtttcacacagtggatgccc 4020 ttccagctgc aaaccatcac tgggaaacat ccatacacta 
tgggacaatttagcctaccc 4080 aattcatctg aactgcatgt ctttgcaggg aaacccacac aaacacgggggagaacatgt 4140 ttggtttaat tgtaaaaaaa caaccagaaa gcataataaa tgagaatctcaaatattttt 4200 accgcatact tcaaaaataa agatgattta gtattaaaaa atgttttattttgaatattg 4260 cttttaaata aattggsctt acacttagta tatgtattaa ttccagtacttttaccataa 4320 accgacatat cmaccatttg gtagaggttg atattttaga aatgacgarawgtgttgaaa 4380 aaaatgcatc gagtgtgtag caacattagg arttaagtat tgcaatgcaaaaattgtaag 4440 twaatcaatt agggactaat tawtcgtcaa tttaaattgt tataatttgctactttttct 4500 caaaccacta ggtttcactg attattcagc aaaatgttat tcatcattttcaattttata 4560 tattttaaca tgagcagcat ttttacttta atatatactg cacaaaaaatagttacattg 4620 tgtttttaag cgtttccttt atttatttat ttttttgagc agtatatttttaaaaagtga 4680 gaataaatat gtagctttag ttttacataa ccatatgatg cacttaacgatgatgaaaca 4740 tttcattcat atttggggca ttttattttt acttattttt tttgaaaaaatggacactaa 4800 ctgtggtttt aatatgattt ctatgtaaat aaaatgactt ttggacatttaatttgatgt 4860 acactgtaaa aaaaatccaa ccttaaattt taagttaaat caagttaaccttatcagtac 4920 attgaactta aattatgtta aactgacata aaactgaatg aataacttataaaattaagt 4980 tagaacacca tagattaatg ttacaatgaa ctaaaaactg tcatgactaattgttcatat 5040 ttatattttt acagtgtaga tgtggaacat ccagtctttg tytataaggtcatataggct 5100 aaaatytaat aaaacattta aataggaatt aaaatttttg tttcttaatatttttattgt 5160 aatttcctaa catttactca gtgaaactaa tttcagtttt gattctttcactataatatg 5220 tgtatatatg tgtattataa aaataatttg tgttcaaaat aaaataaaaaaatttgcaca 5280 atcctccact attcatttga actgaactca catgctgtgt cagctagagatctgccatat 5340 aatattcaaa atggaaagcg tggccacccg tatggtagga gtgtccaaaaaaaagtaccc 5400 caaccccacc cattggtgcc ctacaatttc aaatgaacct actagttcccaaagactgaa 5460 ggagataagc aagcaaacag gcggctagtt cactccatga tctgagaatctcctgryact 5520 gataaacgac atcttcaata ctacacttgc aggatccact agt 5563 274811 DNA Artificial Sequence Description of Artificial Sequence; Note =Synthetic Construct 27 atattttggg ttatggctaa aataattaat gtctaaaacgggattacgcg tttttcgtaa 60 agctcaaaga cgcatgtgcc aaaaatagcc ttttattaaattgtttggtt attaaaatat 120 
tattcaactt attttacatc catggaaaga gacatggcctcttctatttg acctgcatgt 180 gttaaaacga aatgccaaaa taaagaaaaa aatgtaattcaacatgtaag gctattcaaa 240 aacaatacac aggtacaaaa catatctttg ttaatgaaactaatttacag tttgtttatt 300 aaaacacact ataaatgcca tagaacattt tggagatgcatgcgttatac attgcgtgat 360 ttaacagatc aattaaagtc gtattttgcg ccagcatttcaatgggcata acgacttaat 420 gttttcctct agaatgatta caaatgtgaa agcgaatgtgatgtgattga gttgaagaat 480 tagttttttt tggaatgccc caaggacgca tgcattagcccacctgtgct gtttatttaa 540 atcattgact ccaagagctg tcagccacaa aaggagggcgggcgcgctgt catcacccat 600 cagatttatg actgccacac aatcattttc cgactaaactaacgccatca tcactcagaa 660 caagaacttc atgagtcgca caagacaagt tataataaatgcattacagc gaatgcatgc 720 acaaacgcga gaaccacttt tgctgcaaaa taatgtggattgttggttga aatgaaaact 780 gggtgagatg cttttctttc aatccctgtt atccatgcttcagcagagga caggaggctt 840 gtgactttgc ctgtgcctgt gtctgccccc gagtgccctgtcacaatcta attacccgtg 900 agtaaaggac aataccgctt cagctggtct gtgtcattccccctatatcc cagtgcctgc 960 ttattttcac aaacccttct gcgccgcttt ctgccccctcctgccctctt ttaaccccac 1020 ggagaatgat aaatgcgcgg tgagggaacg aacgggcaaagccatttcac ggcacctgtt 1080 aattaaggga atgattgcct ccatttttcg ctgagctcgtttccagcgtg ctccattatt 1140 tgtgatgcga ttaattgaaa gcgaatgtga catcacaacgaacgtgatgt cattgtcgcc 1200 gtcacacagt agaacgacag agttacataa gaaataaagtctgcatgcat acatttatgc 1260 atggcgtttt aaagaagagc gcacactggg ttagagtcctcggtggggtc agccacttcg 1320 gtaacacccc aagcattcaa tgctaagccc ttaaaaggacagcgtctttt gttctaacat 1380 cgagagcacc gggattacca caggtattta gttcaggtattctctaagaa tatttagccc 1440 taggtgagct gaaccaagag cagtcattag cgctaaaactggctctgatg ggaagggcta 1500 acacacacac acacacacac acacacacac acacacacattataataaat gtaatgtcat 1560 gtttacaaca actccggcag tgatgctgca tattggcggcgtacatacac taaatgtttt 1620 aatgtagtct gtaagactag agaatcagaa attaatttacacagaaatta caaaaataaa 1680 tacatgttta aatagttaat aaacataatt caaatatgtaatgtattatc gtgtatttta 1740 acattaatgg atgaggtggt tcaaatgcat tttgcacaaaataaaatcga agcagcttca 1800 aatcgtaaag ataatagtcg gtagcattga atctgctttaacatttactt 
ttagcgaagg 1860 ctactttatt aaggaagctc atattaactc ccaatgaatgtctgctattg cacctttttg 1920 aggtgtagac tgtgtaaaat gcatcactgc acagcaaaatcaagcgtcat attatcctgt 1980 acattctaat ttgttggctt caggctgcca gggctctttgtgctgtgtag ggcccctggc 2040 cagattccag tgtgttaaaa agggatttac gcatctgatattgtcacaca ataaggacaa 2100 atagcccgtt tgagcatctt tatacaacca acgctgacagaggttctgcg gtttaagtgc 2160 ttagtgttgc atttgtgctt aaattgattg tttggtgttcaaccctcact ggaaaaaaat 2220 cttttgatgc aaatgggtgc gtttagataa aaagaagcaaagcctagaac taaagcctag 2280 aatttatatt gcactgtaga tgtggatggt tatgggaaagttttttgaga tactgtgggg 2340 cgagtcacgg cgtcagagtg gcggccggta ggggctctaaactcgcgctc caattattgc 2400 ctgtcagtca tcatcgcttt agattagagc atgcggattaaaactcatgc ctttaaataa 2460 taacaacagc gtcaatatta tcaaaaagac acatcacgcttatttaaaat ctacgaaatg 2520 tgttaaagca taatttgtac tactggttga ttgttgtagacctgaaatcc tgtcagatag 2580 aaatgaacta cccggaccac tggtagttaa gtctctcttgtgttatcttt gattgatcca 2640 accagacaag ctagttaaat taataattta taagcgcaaagcgttggtac aagcagttag 2700 agggagaaag gtgagaagaa gcaatacaaa gtagctaaattcacaatgca ttacattgtc 2760 cattttagaa atgaaacacg aggatttaat gttaaatgaatacagagtag ctataatcag 2820 caatacaaag tagctaaatt cagcaataca aagtagctaaattcagcaat acaaagtagc 2880 tatattcagc aatacaaagt agctaaattc agcaatacaaagtagctata ttcagcaata 2940 caaagtagct atattcagca atacaaagta gctaaattcagcaatacaac gtagctatac 3000 tttgtagcta tacactgtat ccattttaga aatgcacacgatgattttct gttaaaaatc 3060 actgctcatt tgaattagat tatttgaatt ggagcttacattgcatgtaa ttagtaagca 3120 aattcggctt aacaaatttg aaacgcgttt ttttttctcgactaaattaa ttaagaaaat 3180 gtattattga tgggtgcaaa cagtaacaat ttattaaaccctctatgcaa atgaggtgtt 3240 cagctgacta acctgcatcc acagtttatc taaacgcttatcaaactaat tggcgacgtt 3300 ctgtctttct gcctgcggtg ggcgagcctg ctgcttgttttgccacgaga taattgtacg 3360 caagaatcaa cgaagctgcc ctaatggcca ccaattggctttatttggac ctgcccatgc 3420 gacctgtcgg cacctccaag agacgggctc gctattaatatgtaaagtga cgtttgatcg 3480 cttgaaacgg catacaaaga cagtgttttc acaagaagaatgtggtgaca actcatttaa 3540 aactattaga cgcgcaagaa 
caatagcccc caatttagagaccataaaat actcctcccc 3600 aattaatgcc tgaggtgcta ggagttgagt ttgcttgcattaggcacata tctcatgtga 3660 cacttcagtg ttacaggttt tgttgtttta agctaatgttaatggtcagg gaacagctcg 3720 taatcacaat atatatttaa aacaaatgat tattatgaatgcaataggcc aaatcgatat 3780 tcattaatag aatagaggca ttttaataca tttctgcacaattaaaaatt aaatataatc 3840 ctgcaagtct ataattatat tattcacatc atttaatgtcctaaaaataa atttaaaaaa 3900 tagcattagg ctgcaactta gattttaggc ttttctgttagcacttgagt aaaaagacat 3960 cattacacac catcaacgtg aagctctaaa aagggtaaaaagatctcaat aaattgctgc 4020 gctgaatgat gagtctctca gctctctgga tgtggagcagtaggccgaca gtcgccgtgg 4080 catttcggaa agcatgctgt ccgagccaat ggcagtcagcgcgctctgct attggttccc 4140 agggcgctca ctgccagctc gtgtccccgc ccatgttcgtaagatatgga atctactggc 4200 gccagttccg acagtacaca ggcacaattc attaatgagacttctctccg ctttagacag 4260 acgcagagtt ttagggagac tttaacaatc gggctgtggacaatttaaac cagtggcgaa 4320 ttacgaacgt caacaggcat cttgaggatt aacattctttgcgcaggact aacacgggaa 4380 aaataaacgc aggattggag tgctgaaatg caactttgcgccgtgagtac ttcccgatag 4440 ttatttgaaa ttgcgagcat ttaattgagc gatttaattgattgactaca aaagttagcc 4500 tacttatatt aactgaggcg tcgtcgtgtg aattaagatctgtcttgcac tgtgtttaac 4560 gtcaacactg agatgcttct atctgttatt ctcttacaggtgtccctggc cacccttgaa 4620 tgcaaagaag caggacctct acactccttc aaaaataaaagcatgctcag aaagtaaaca 4680 gagcatcgcc acctgaagca ttaagctaac gacagatattttaataatct aacggactat 4740 agtggtgctt tcgggtctgt agtgtcaagt aaacttttccaagcattttc taagcgcgga 4800 cacttgagat g 4811

I claim:
 1. A transgenic fish the cells of which contain an exogenous construct, wherein the construct comprises homologous expression sequences operably linked to a sequence encoding an expression product, wherein the expression product is expressed only in specific cell lineages.
 2. The transgenic fish of claim 1 wherein the expression sequences and the sequence encoding the expression product are not operably linked in nature.
 3. The transgenic fish of claim 1 wherein the expression product is heterologous.
 4. The transgenic fish of claim 3 wherein the expression product is a reporter protein.
 5. The transgenic fish of claim 4 wherein the reporter protein is selected from the group consisting of β-galactosidase, chloramphenicol acetyltransferase, and green fluorescent protein.
 6. The transgenic fish of claim 5 wherein the reporter protein is green fluorescent protein.
 7. The transgenic fish of claim 1 wherein the fish is selected from the group consisting of zebrafish, medaka, trout, salmon, carp, tilapia, goldfish, loach, and catfish.
 8. The transgenic fish of claim 7 wherein the fish is zebrafish.
 9. The transgenic fish of claim 1 wherein the expression product is expressed only in cells selected from the group consisting of blood cells, nerve cells, and skin cells.
 10. The transgenic fish of claim 9 wherein the expression product is expressed only in blood cells.
 11. The transgenic fish of claim 10 wherein the expression product is expressed only in erythroid progenitor cells.
 12. The transgenic fish of claim 9 wherein the expression product is expressed only in neurons.
 13. The transgenic fish of claim 1 wherein the expression sequences are selected from the group consisting of GATA-1 expression sequences and GATA-2 expression sequences.
 14. The transgenic fish of claim 13 wherein the expression sequences comprise GATA-1 expression sequences.
 15. The transgenic fish of claim 13 wherein the expression sequences comprise GATA-2 expression sequences.
 16. The transgenic fish of claim 15 wherein the expression sequences comprise the GATA-2 promoter operably linked to the neuron-specific enhancer of GATA-2.
 17. The transgenic fish of claim 15 wherein the expression sequences comprise the GATA-2 promoter operably linked to the blood-specific enhancer of GATA-2.
 18. The transgenic fish of claim 15 wherein the expression sequences comprise the GATA-2 promoter operably linked to the skin-specific enhancer of GATA-2.
 19. The transgenic fish of claim 1 wherein the transgenic fish developed from, or is the progeny of a transgenic fish developed from, an embryonic cell into which the construct was introduced.
 20. The transgenic fish of claim 1 wherein the expression product is expressed only in predetermined cell lineages.
 21. The transgenic fish of claim 1 wherein the exogenous construct is genetically linked to an identified mutant gene.
 22. The transgenic fish of claim 1 wherein the expression sequences comprise a homologous promoter operably linked to a homologous enhancer.
 23. The transgenic fish of claim 22 wherein the expression sequences further comprise homologous 5′ untranslated sequences operably linked to the promoter and the sequence encoding the expression product.
 24. The transgenic fish of claim 1 wherein the construct further comprises (a) intron sequences operably linked to the sequence encoding the expression product, (b) a polyadenylation signal operably linked to the sequence encoding the expression product, or both.
 25. Cells isolated from the transgenic fish of claim 1 wherein the cells express the expression product.
 26. A method of making transgenic fish, the method comprising (a) introducing an exogenous construct into an embryonic cell of a first fish, wherein the construct comprises homologous expression sequences operably linked to a sequence encoding an expression product, and (b) allowing the egg cell or embryonic cells to develop into a second fish, wherein the expression product is expressed only in specific cell lineages of the second fish.
 27. The method of claim 26 wherein the expression product is expressed only in predetermined cell lineages.
 28. The method of claim 26 wherein the method further comprises producing progeny of the second fish.
 29. The method of claim 26 wherein the expression sequences and the sequence encoding the expression product are not operably linked in nature.
 30. The method of claim 26, wherein the expression sequences are expression sequences of a fish gene, wherein the method further comprises (c) exposing the second fish or progeny of the second fish to a test compound, (d) detecting the expression product in the fish exposed to the test compound, and (e) comparing the pattern of expression of the expression product in the fish exposed to the test compound with the pattern of expression of the expression product in the second fish or progeny of the second fish not exposed to the test compound, wherein if the pattern of expression of the expression product in the fish exposed to the test compound differs from the pattern of expression in the fish not exposed to the test compound, then the test compound affects expression of the fish gene.
 31. The method of claim 26, wherein the expression sequences are expression sequences of a fish gene, wherein the method further comprises (c) detecting the expression product in the second fish or progeny of the second fish, wherein the pattern of expression of the expression product in the second fish or progeny of the second fish identifies the pattern of expression of the fish gene.
 32. The method of claim 26, wherein the expression sequences are expression sequences of a fish gene, wherein the method further comprises (c) crossing the second fish or progeny of the second fish to a third fish having an identified mutant gene to produce a fourth fish having both the exogenous construct and the identified mutation, (d) detecting the expression product in the fourth fish or progeny of the fourth fish, and (e) comparing the pattern of expression of the expression product in the fourth fish or the progeny of the fourth fish with the pattern of expression of the expression product in the second fish, wherein if the pattern of expression of the expression product in the fourth fish or progeny of the fourth fish differs from the pattern of expression in the second fish, then the mutant gene affects expression of the fish gene.
 33. The method of claim 26, wherein the method further comprises (c) crossing the second fish or progeny of the second fish to a third fish having an identified mutant gene, wherein the exogenous construct and the mutant gene map to the same region of the genome, to produce a fourth fish having both the exogenous construct and the mutant gene, and (d) crossing the fourth fish to a fifth fish, wherein the fifth fish has neither the exogenous construct nor the mutant gene, to produce a sixth fish, wherein the sixth fish has both the exogenous construct and the mutant gene, wherein the mutant gene is marked by the exogenous construct in the sixth fish.
 34. The method of claim 33, wherein the method further comprises (e) crossing the sixth fish, or a progeny of the sixth fish, with a seventh fish, and (f) identifying progeny fish expressing the expression product, wherein fish expressing the expression product have the mutant gene.
 35. The method of claim 26, wherein the construct comprises a homologous promoter operably linked to a sequence encoding an expression product, wherein the promoter is not operably linked to an enhancer, wherein the method further comprises (c) detecting the expression product in the second fish or progeny of the second fish, wherein if the expression product is detected, then the exogenous construct is operably linked to an enhancer.
 36. The method of claim 35 further comprising (d) isolating the enhancer from the second fish or progeny of the second fish.
 37. The method of claim 35 further comprising (d) determining the pattern of expression of the expression product in the second fish or progeny of the second fish, wherein the pattern of expression of the expression product in the second fish or progeny of the second fish identifies the pattern of expression of the enhancer.
 38. A method of identifying regulatory elements in sequences upstream of a gene of interest, the method comprising (a) introducing members of a set of exogenous constructs into separate embryonic cells, wherein each member of the set of constructs comprises a sequence encoding an expression product operably linked to upstream sequences of a homologous gene of interest, wherein the different members of the set have different regions of the upstream sequences deleted, (b) allowing the embryonic cells to develop into fish, (c) detecting the expression product in the fish or progeny of the fish, (d) determining which regions of the upstream sequences are needed for expression of the expression product.
 39. The method of claim 38 wherein determining which regions of the upstream sequences are needed for expression is accomplished by comparing the expression of the expression product in fish into which different members of the set of exogenous constructs have been introduced, wherein if the expression product is detected in cells of interest in a fish, then the exogenous construct introduced into that fish includes a regulatory element for expression in the cells of interest, wherein if the expression product is not detected in cells of interest in a fish, then the exogenous construct introduced into that fish does not include a regulatory element for expression in the cells of interest.
 40. A nucleic acid construct comprising expression sequences derived from fish operably linked to a sequence encoding an expression product, wherein the expression sequences comprise a promoter operably linked to an enhancer, wherein the expression product is expressed only in specific cell lineages.